Thursday, September 23, 2010

trackDb

UCSC genome browser
Kent source tree

Get it here: http://hgdownload.cse.ucsc.edu/admin/jksrc.zip
Or view it online: http://hgwdev.cse.ucsc.edu/~kent/src/unzipped/

Important README files: kent/src/product/README.xxx

System design

The genome browser consists of the following parts:
For service:
  server-side CGI binaries and HTML files
  hg.conf configuration file
For administration:
  utility tools
  .hg.conf file
Data:
  stored in MySQL
  stored in external files (optional; e.g. bigBed, bigWig, ...)
Design principles:
Each species (more precisely, each genome assembly, such as hg19) has its own MySQL database. Inside each database:
  each track corresponds to one table, and each track belongs to a group
  one trackDb table describes the tracks of that database
  each record in trackDb defines one track
  one grp table holds the group information
  one hgFindSpec table
Apart from the per-species databases, some special databases exist:
  hgFixed
  hgcentral

Constructing new database

Download the Kent source and compile everything; see src/product/README.trackDb for instructions.
Create an hg.conf file in the web server's cgi-bin/ directory. A sample file can be found at: http://genome-test.cse.ucsc.edu/~kent/src/unzipped/product/ex.hg.conf
Create a .hg.conf file in your home directory and make it readable only by you:
$ cat > ~/.hg.conf <<'EOF'
db.host=127.0.0.1
db.user=hguser
db.password=hguser
db.trackDb=trackDb
db.grp=grp
central.db=hgcentral
central.host=127.0.0.1
central.user=hguser
central.password=hguser
EOF
$ chmod 600 ~/.hg.conf
Create the MySQL database. Make sure every directory under the MySQL data directory is owned by user and group mysql (mysql.mysql); otherwise MySQL fails with "errno: 13" (permission denied) when you try to modify the database.
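
The key=value layout of ~/.hg.conf shown in the steps above is simple enough to sanity-check with a short script before running any of the loader tools. The following is a minimal sketch and not part of the Kent tools: the names parse_hg_conf, is_private and EXPECTED_KEYS are made up for illustration, the key list is just the set of keys used in this post, and the parser assumes the file contains only blank lines, "#" comment lines and key=value pairs.

```python
import os
import stat

# Keys that this post puts into ~/.hg.conf (an illustrative list, not an official one).
EXPECTED_KEYS = [
    "db.host", "db.user", "db.password",
    "central.db", "central.host", "central.user", "central.password",
]


def parse_hg_conf(text):
    """Parse key=value lines into a dict (assumed format, not the Kent parser)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


def is_private(path):
    """True if group and others have no permission bits set (as after chmod 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    path = os.path.expanduser("~/.hg.conf")
    with open(path) as handle:
        conf = parse_hg_conf(handle.read())
    print("missing keys:", [k for k in EXPECTED_KEYS if k not in conf])
    print("private (chmod 600 style):", is_private(path))
```

Run as a script, it reports which of the keys listed above are absent and whether the file permissions are as restrictive as the chmod 600 step intends.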
To add new tracks:

Refer to the general guide: kent/src/product/README.trackDb
Generate the track data in an appropriate format (http://genome.ucsc.edu/FAQ/FAQformat.html).
Create a table to hold the track data. Identify the format of the track data file and use the corresponding loader program to load it. The following example is for BED format:
$ hgLoadBed dbName trackTableName file.bed
A realistic example, for a bedGraph file:
$ hgLoadBed -noBin -bedGraph=4 hg19 track_name data.bedGraph
The source of the loader programs is located in: kent/src/hg/makeDb/
Create or update the trackDb table:
Compose a new trackDb.ra file (by editing the old one) that includes a configuration section for the new track.
Example *.ra files: src/hg/makeDb/trackDb/[organism name]/[genome version]
Information on trackDb options: src/hg/makeDb/trackDb/README. Also see the next section.
Write a makefile in the directory where the trackDb.ra file resides (the recipe line must start with a tab). It could look like this:
trackDbSql=/home/cgs/twlab/xzhou/kent/src/hg/lib/trackDb.sql
DB=hg19
all::
	~/latest/hgTrackDb . ${DB} trackDb ${trackDbSql} .
Run $ make all to update the trackDb table.
Tracks can also be stored in bigBed or bigWig format. In that case the database table holds only the path to the data file, as in the command below. See: http://www.mail-archive.com/genome@lists.soe.ucsc.edu/msg00924.html

hgsql hg19 -e 'drop table if exists myLocalBigWig; create table myLocalBigWig (fileName varchar(255) not null); insert into myLocalBigWig values ("/gbdb/hg19/bbi/myLocalBigWig.bw");'
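
Before loading a file with hgLoadBed -bedGraph=4 as in the realistic example above, it can help to confirm that every data line really has the four bedGraph columns (chrom, start, end, value). The sketch below is only a basic shape check and not part of the Kent tools; check_bedgraph is a hypothetical helper name, and skipping "track"/"browser" header lines is an assumption about the input file, not a statement about what the loader accepts.

```python
def check_bedgraph(lines):
    """Return a list of (line_number, message) problems; an empty list means OK."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if fields[0] in ("track", "browser"):
            continue  # header lines are skipped here (assumption, see lead-in)
        if len(fields) != 4:
            problems.append((number, f"expected 4 columns, found {len(fields)}"))
            continue
        _chrom, start, end, value = fields
        try:
            start_pos, end_pos = int(start), int(end)
            float(value)
        except ValueError:
            problems.append((number, "start/end must be integers, value a number"))
            continue
        if start_pos < 0 or end_pos < start_pos:
            problems.append((number, "need 0 <= start <= end"))
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_bedgraph.py data.bedGraph
    with open(sys.argv[1]) as handle:
        for number, message in check_bedgraph(handle):
            print(f"line {number}: {message}")
```

A file that produces no output from this check can still be rejected by the loader for other reasons (for example, unknown chromosome names), so treat it as a first filter only.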
About trackDb.ra files:

Example files for human hg19 can be found at: kent/src/hg/makeDb/trackDb/human/hg19/
Use blank lines to separate tracks.
Each line consists of an attribute name followed by its value, separated by a space.
Configurations:
  field: track
    track trackName [override]
  field: type
    many types are available...
  Track height: maxHeightPixels 128:32:16 (the three values are max:default:min)
  Track color: color r,g,b
  Subgroups: subGroup1 sampleType Sample_Type fetalK=fetalK CD34=CD34 ....
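
The rules above (blank lines separate tracks, each line is an attribute name followed by its value) can be illustrated with a small reader. This is a minimal sketch, not the parser used by hgTrackDb: parse_ra, parse_max_height and parse_color are hypothetical helper names, the sample stanzas are made up and incomplete, "#" lines are assumed to be comments, and line continuations are not handled.

```python
SAMPLE_RA = """\
track myLocalBigWig
type bigWig
maxHeightPixels 128:32:16
color 0,0,255

track track_name
type bedGraph 4
"""


def parse_ra(text):
    """Split .ra text into a list of dicts, one per blank-line-separated stanza."""
    stanzas, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            if current:  # a blank line closes the current stanza
                stanzas.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # assumed comment syntax
        parts = line.split(None, 1)  # attribute name, then the rest as the value
        current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    if current:
        stanzas.append(current)
    return stanzas


def parse_max_height(value):
    """'128:32:16' -> (max, default, min) track heights in pixels."""
    maximum, default, minimum = (int(part) for part in value.split(":"))
    return maximum, default, minimum


def parse_color(value):
    """'r,g,b' -> tuple of three ints, each checked to be in 0..255."""
    red, green, blue = (int(part) for part in value.split(","))
    if not all(0 <= channel <= 255 for channel in (red, green, blue)):
        raise ValueError(f"color channel out of range: {value!r}")
    return red, green, blue


if __name__ == "__main__":
    for stanza in parse_ra(SAMPLE_RA):
        print(stanza["track"], "->", stanza.get("type"))
```

Keeping each stanza as a plain dict makes it easy to spot a missing or misspelled attribute before running make all.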
