<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Back of a Stamp &#187; ISMB2009</title>
	<atom:link href="http://www.cassj.co.uk/blog/?feed=rss2&#038;tag=ismb2009" rel="self" type="application/rss+xml" />
	<link>http://www.cassj.co.uk/blog</link>
	<description>The sum total of interesting things I know</description>
	<lastBuildDate>Mon, 02 Aug 2010 12:24:06 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Setting up Galaxy</title>
		<link>http://www.cassj.co.uk/blog/?p=359</link>
		<comments>http://www.cassj.co.uk/blog/?p=359#comments</comments>
		<pubDate>Tue, 07 Jul 2009 12:44:33 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Galaxy]]></category>
		<category><![CDATA[ISMB2009]]></category>
		<category><![CDATA[NGS]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=359</guid>
		<description><![CDATA[At ISMB, the Galaxy guys were talking about using Galaxy as an interface for analysing NGS data. I&#8217;m having a go at getting it up and running on EC2. Notes are really for my own reference, but I thought I&#8217;d post them in case they were of use to anyone else.
AWS Setup
Obviously, you need an [...]]]></description>
			<content:encoded><![CDATA[<p>At ISMB, the <a href="http://g2.trac.bx.psu.edu/">Galaxy</a> guys were talking about using Galaxy as an interface for analysing NGS data. I&#8217;m having a go at getting it up and running on EC2. Notes are really for my own reference, but I thought I&#8217;d post them in case they were of use to anyone else.</p>
<h3>AWS Setup</h3>
<p>Obviously, you need an <a href="http://aws.amazon.com/">AWS EC2 account</a> to get this working. Once you&#8217;ve got your account set up, install the AWS command line tools from <a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=351">here</a> (ignore the negative reviews; they work fine on Linux). There are a few environment variables to set too. <code>EC2_HOME</code> is where the tools live, and you&#8217;ll need to add its <code>bin</code> directory to your path. You&#8217;ll also need to tell the tools where your AWS private key and x509 certificate files are (these should have been generated during AWS registration but, if not, see the <a href="http://docs.amazonwebservices.com/AmazonDevPay/latest/DevPayDeveloperGuide/index.html?X509Certificates.html">AWS x509 docs</a>). Something like the following in your <code>~/.profile</code> should do the trick:</p>
<pre class="brush: bash;">
export EC2_HOME=/home/cassj/.ec2
export PATH=$PATH:$EC2_HOME/bin
export EC2_PRIVATE_KEY=/home/cassj/.ec2/amazon-pk.pem
export EC2_CERT=/home/cassj/.ec2/amazon-x509.pem
</pre>
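<p>Something like the following would do as a quick sanity check before going any further. It&#8217;s only a sketch: <code>check_ec2_env</code> is a made-up helper name, not part of the AWS tools, and all it does is confirm that the variables above point at things that exist and are readable.</p>
<pre class="brush: bash;">
# Hypothetical helper (not part of the AWS tools): report whether each
# location set in ~/.profile is readable, and fail if any is missing.
check_ec2_env() {
  status=0
  for path in "$EC2_HOME/bin" "$EC2_PRIVATE_KEY" "$EC2_CERT"; do
    if [ -r "$path" ]; then
      echo "ok:      $path"
    else
      echo "missing: $path"
      status=1
    fi
  done
  return $status
}
</pre>
<p>Run <code>check_ec2_env</code> in a fresh shell after editing <code>~/.profile</code>; anything reported as missing needs fixing before the <code>ec2-*</code> commands will work.</p>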
<p>You&#8217;ll also need to create and register a key-pair to access your EC2 instances. The easiest way to do this is via the <a href="http://aws.amazon.com/console/">AWS management console</a> &#8211; it&#8217;s fairly self-explanatory. Save the .pem file somewhere on your local machine (I&#8217;m using <code>cassj.pem</code> ).</p>
<h3>Start Instance</h3>
<p>Create a security group for the Galaxy server, so that we can open the appropriate ports on it, then run an instance. I&#8217;m using the <a href="https://help.ubuntu.com/community/EC2StartersGuide">official Ubuntu Intrepid x86 server AMI</a> as a base: <code>ami-5059be39</code>. I&#8217;m also using <code>us-east-1b</code> cos that&#8217;s where the EBS volume with all my ChIPseq data lives.</p>
<pre class="brush: bash;">
ec2-add-group galaxy -d 'Group for Galaxy Server'
ec2-run-instances ami-5059be39 --region us-east-1 --availability-zone us-east-1b --key cassj --group galaxy --instance-type m1.small --instance-count 1
</pre>
<h3>Connect to instance</h3>
<p>Open up the ssh port. I&#8217;m just opening it to everyone; alternatively, you can restrict the source IP addresses using <a href="http://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing">CIDR</a> notation.</p>
<pre class="brush: bash;">
ec2-authorize galaxy -Ptcp -p22 -s 0.0.0.0/0
</pre>
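<p>For example, to allow ssh from a single machine only, instead of the open rule above, something like this should work. It&#8217;s a sketch: <code>203.0.113.7</code> is a placeholder address from the documentation range, so substitute your own public IP. A <code>/32</code> suffix matches exactly one address.</p>
<pre class="brush: bash;">
# Placeholder address: replace with your own public IP.
MY_IP=203.0.113.7
SSH_SOURCE="$MY_IP/32"

# Print the command for review; drop the echo to actually run it.
echo ec2-authorize galaxy -Ptcp -p22 -s "$SSH_SOURCE"
</pre>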
<p>Run <code>ec2din</code> (short for <code>ec2-describe-instances</code>) to check your instance is running and get its address, then ssh in using your keypair, something like:</p>
<pre class="brush: bash;">
ssh -i cassj.pem ubuntu@ec2-174-129-174-125.compute-1.amazonaws.com
</pre>
<h3>Install Galaxy</h3>
<p>Install Galaxy on your running instance. The following will grab the latest version from the repository and stick it in <code>/galaxy</code>.</p>
<pre class="brush: bash;">
sudo apt-get install mercurial
cd /
sudo hg clone http://www.bx.psu.edu/hg/galaxy galaxy
sudo chown -R ubuntu:ubuntu galaxy
cd galaxy
sudo sh setup.sh
sh run.sh
</pre>
<p>Then edit <code>universe_wsgi.ini</code> so that <code>host</code> is set to the instance&#8217;s public hostname, e.g.</p>
<pre class="brush: bash;">
host = ec2-174-129-166-230.compute-1.amazonaws.com
</pre>
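<p>If you&#8217;d rather script that edit, something along these lines might do. It&#8217;s a sketch under assumptions: <code>set_galaxy_host</code> is a made-up helper, it assumes the setting appears in the file as a <code>host = ...</code> line (possibly commented out with <code>#</code>), and <code>sed -i</code> is the GNU sed option available on Ubuntu.</p>
<pre class="brush: bash;">
# Hypothetical helper: set_galaxy_host FILE HOSTNAME
# Rewrites a (possibly commented-out) "host = ..." line in place.
set_galaxy_host() {
  sed -i "s|^#*host *=.*|host = $2|" "$1"
}

# e.g.
#   set_galaxy_host /galaxy/universe_wsgi.ini ec2-174-129-166-230.compute-1.amazonaws.com
</pre>
<p>Check the result with <code>grep host universe_wsgi.ini</code> afterwards, since the pattern will do nothing if the file has no such line.</p>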
<p>Well, that was easy.  You seem to need to run <code>run.sh</code> as root initially, but after that it seems to be ok if you run as user ubuntu.</p>
<h3>Install Apache for static files</h3>
<p>By default Galaxy runs on port 8080. We&#8217;ll set up Apache on port 80 and tell it to serve requests for static files itself, to take the load off the Galaxy process, and to hand anything else over to Galaxy. So, install Apache2 and enable <code>mod_rewrite</code>, <code>mod_proxy</code> and <code>mod_proxy_http</code>:</p>
<pre class="brush: bash;">
sudo apt-get install apache2
sudo a2enmod rewrite
sudo a2enmod proxy
sudo a2enmod proxy_http
</pre>
<p>In <code>/etc/apache2/sites-available/default</code>, these rules handle the limited number of static files that we want Apache to deal with. They have to come first, because rules are tried in order and the catch-all proxy rule would otherwise match everything:</p>
<pre class="brush: xml;">
&lt;IfModule mod_rewrite.c&gt;
  RewriteEngine On
  RewriteRule ^/static/style/(.*) /galaxy/test/static/june_2007_style/blue/$1 [L]
  RewriteRule ^/static/(.*) /galaxy/test/static/$1 [L]
  RewriteRule ^/images/(.*) /galaxy/test/static/images/$1 [L]
  RewriteRule ^/favicon.ico /galaxy/test/static/favicon.ico [L]
  RewriteRule ^/robots.txt /galaxy/test/static/robots.txt [L]
</pre>
<p>And this will hand everything else over to Galaxy:</p>
<pre class="brush: xml;">
  RewriteRule ^/(.*) http://ec2-174-129-166-230.compute-1.amazonaws.com:8080/$1 [P]
&lt;/IfModule&gt;
</pre>
<p>More info on installing Galaxy can be found on the <a href="http://g2.trac.bx.psu.edu/wiki/HowToInstall">wiki</a>.</p>
<p>Restart apache with <code>sudo /etc/init.d/apache2 restart</code>. </p>
<h3>Authorize Apache and Galaxy Ports</h3>
<pre class="brush: bash;">
ec2-authorize galaxy -Ptcp -p8080 -s 0.0.0.0/0
ec2-authorize galaxy -Ptcp -p80 -s 0.0.0.0/0
</pre>
<p>Now if you go to <code>http://&lt;Your AWS URL&gt;</code> you should see your Galaxy installation.<br />
It&#8217;s not going to be totally functional because we haven&#8217;t installed all of the underlying bioinformatics binaries but my plan is to have separate instances doing the actual analysis anyway. That&#8217;s tomorrow&#8217;s problem though&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=359</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>predicting functionality of protein-dna interactions by integrating diverse evidence</title>
		<link>http://www.cassj.co.uk/blog/?p=357</link>
		<comments>http://www.cassj.co.uk/blog/?p=357#comments</comments>
		<pubDate>Mon, 29 Jun 2009 14:04:47 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ISMB2009]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=357</guid>
		<description><![CDATA[Duygu Ucar
background about TFs. ChIPchip, ChIPseq can define binding (with noise), but knowing when binding is functional is less well characterized.
TF binding should be inferred from diverse and complementary data. 
Binding doesn&#8217;t imply functionality. Binding and gene regulation are context dependent.
Functional vs Non-functional discrimination? Which factors determine functionality of binding? 
Yeast data.
Define binding:
 ChIPchip (pvals [...]]]></description>
			<content:encoded><![CDATA[<p>Duygu Ucar</p>
<p>background about TFs. ChIPchip, ChIPseq can define binding (with noise), but knowing when binding is functional is less well characterized.</p>
<p>TF binding should be inferred from diverse and complementary data. </p>
<p>Binding doesn&#8217;t imply functionality. Binding and gene regulation are context dependent.</p>
<p>Functional vs Non-functional discrimination? Which factors determine functionality of binding? </p>
<p>Yeast data.<br />
Define binding:<br />
 ChIPchip (pvals corres to strength of binding),<br />
  PSSM (200bp downstream and 800bp upstream of the start codon) assoc with a significance score,<br />
  Nucleosome Occupancy &#8211; significantly lower around binding site.</p>
<p>Bayesian prob of binding based on the 3 sets</p>
<p>ROC curve shows integrated model performs best (known interactions in normal growth conditions with 5-fold cross validation)</p>
<p>Assuming binding is functional if it changes gene expression &#8211; as measured with arrays.</p>
<p>integration with gene expression data &#8211; find instances where  differential binding correlates with differential expression (between normal growth conditions and stress condition).</p>
<p>Looked at functional binding rates in different stress conditions. Functional binding rate is context dependent &#8211; differs between stress conditions. Can rank the impact of particular transcription factors in different stress conditions.</p>
<p>Factors determining functionality (from FB and NFB enriched sets).<br />
Distance from the start codon, orientation wrt direction of transcription. Presence or absence of co-factors on the same promoter.</p>
<p>Feature selection to determine important factors: multivariate random forest classification algorithm. Feature importance score calc based on change in estimation error before and after permuting values of a vector. Identify significant factors for each (condition, TF) pair, calc p value.</p>
<p>Discriminatory factors are different in different conditions, but in most cases, co-factors were most important. Can use this data to determine significant TF-TF co-factor interactions.</p>
<p>Now have 2 sets, 1 enriched in functional binding sites and the other enriched in non-functional.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=357</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>ISA infrastructure</title>
		<link>http://www.cassj.co.uk/blog/?p=355</link>
		<comments>http://www.cassj.co.uk/blog/?p=355#comments</comments>
		<pubDate>Mon, 29 Jun 2009 13:12:54 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ISMB2009]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=355</guid>
		<description><![CDATA[Contextualise data with metadata
Only considering metadata &#8211; actual data is just in native format and linked to metadata.
Consistent reporting of metadata:
Define scope and minimal info (MIBBI http://mibbi.org), syntax (ISATAB)  and semantics (OBO)
http://www.mibbi.org/index.php/Main_Page
http://isatab.sourceforge.net/
http://www.obofoundry.org/
isacreator &#8211; standalone java app to annotate and link data on the basis of community-defined metadata ontologies.
poster c19
]]></description>
			<content:encoded><![CDATA[<p>Contextualise data with metadata</p>
<p>Only considering metadata &#8211; actual data is just in native format and linked to metadata.</p>
<p>Consistent reporting of metadata:<br />
Define scope and minimal info (MIBBI http://mibbi.org), syntax (ISATAB)  and semantics (OBO)</p>
<p>http://www.mibbi.org/index.php/Main_Page<br />
http://isatab.sourceforge.net/<br />
http://www.obofoundry.org/</p>
<p>isacreator &#8211; standalone java app to annotate and link data on the basis of community-defined metadata ontologies.</p>
<p>poster c19</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=355</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ebi tools</title>
		<link>http://www.cassj.co.uk/blog/?p=353</link>
		<comments>http://www.cassj.co.uk/blog/?p=353#comments</comments>
		<pubDate>Mon, 29 Jun 2009 12:41:36 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ISMB2009]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=353</guid>
		<description><![CDATA[Improvements to framework. Better input validation etc.
advanced results analysis and rendering. 
data model shared by all analysis tools within a category
standard web interface www.ebi.ac.uk/Tools/category/tool
similarly SOAP API:  Tools/services/soap/tool?wsdl
            REST API: Tools/services/rest/tool
Soap WS GetParameters(), GetParameterDetails()
REST WS GET /parameters
        [...]]]></description>
			<content:encoded><![CDATA[<p>Improvements to framework. Better input validation etc.</p>
<p>advanced results analysis and rendering. </p>
<p>data model shared by all analysis tools within a category</p>
<p>standard web interface www.ebi.ac.uk/Tools/category/tool<br />
similarly SOAP API:  Tools/services/soap/tool?wsdl<br />
            REST API: Tools/services/rest/tool</p>
<p>Soap WS GetParameters(), GetParameterDetails()<br />
REST WS GET /parameters<br />
             GET /parameterdetails/param</p>
<p>Running:<br />
        SOAP: Run(input) returns job id<br />
                 GetStatus(jobid) returns RUNNING, FINISHED&#8230;<br />
        REST:<br />
             POST /run<br />
              GET /status/jobid</p>
<p>Retrieving results:<br />
   SOAP:<br />
         GetResultTypes(jobid)<br />
        GetResult(jobid, type)<br />
            view of result is base64 encoded<br />
   REST<br />
         GET /resulttypes/jobid<br />
         GET /result/jobid/type<br />
                 get result back with appropriate MIME type</p>
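<p>The REST side of this could be sketched roughly as follows. This only illustrates the URL scheme in the notes above: the <code>http://</code> prefix, the placeholder tool name <code>sometool</code>, the helper names and the <code>FETCH</code> hook (the command used to fetch a URL) are all assumptions. The parameters that <code>POST /run</code> expects aren&#8217;t in these notes, so job submission is left out.</p>
<pre class="brush: bash;">
# URL pattern from the notes above; TOOL is a placeholder, not a real tool name.
BASE="http://www.ebi.ac.uk/Tools/services/rest"
TOOL="sometool"

status_url()      { echo "$BASE/$TOOL/status/$1"; }
resulttypes_url() { echo "$BASE/$TOOL/resulttypes/$1"; }
result_url()      { echo "$BASE/$TOOL/result/$1/$2"; }

# Command used to fetch a URL (an assumption; any HTTP client would do).
FETCH="curl -s"

# Block while the job status is RUNNING, checking every few seconds.
wait_for_job() {
  while [ "$($FETCH "$(status_url "$1")")" = "RUNNING" ]; do
    sleep "${POLL_SECONDS:-5}"
  done
}

# Typical use once a job id is known (the result type name xml is a guess):
#   wait_for_job "$JOBID"
#   $FETCH "$(resulttypes_url "$JOBID")"
#   $FETCH "$(result_url "$JOBID" xml)"
</pre>
<p>The loop stops on any status other than RUNNING, so the caller still needs to check whether the job actually finished before fetching results.</p>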
<p>Tools are categorised and subcategorised. Also have context, eg proteins, nucleotide and even more specific.</p>
<p>demo&#8230;<br />
shiny<br />
still in beta. </p>
<p>Can get the results as XML (with common data model across tool category so eg. all sequence search XML can be parsed the same)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=353</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ELIXIR &#8211; sustainable infrastructure for bio info in Europe</title>
		<link>http://www.cassj.co.uk/blog/?p=343</link>
		<comments>http://www.cassj.co.uk/blog/?p=343#comments</comments>
		<pubDate>Mon, 29 Jun 2009 09:42:09 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ISMB2009]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=343</guid>
		<description><![CDATA[Janet Thornton
Disclaimer: My notes. Fairly incoherent and probably not accurate  
ELIXIR is a European effort to co-ordinate the infrastructure. Preparatory project &#8211; 10-20yr roadmap for infrastructure devel to support research.  
EU 32 partners, 13 member states. 4.5mill E funding to define scope, cost of infrastructure.
Goals:
Co-ordinated data resources, integration &#38; interoperability of data, links [...]]]></description>
			<content:encoded><![CDATA[<p>Janet Thornton<br />
Disclaimer: My notes. Fairly incoherent and probably not accurate <img src='http://www.cassj.co.uk/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>ELIXIR is a European effort to co-ordinate the infrastructure. Preparatory project &#8211; 10-20yr roadmap for infrastructure devel to support research.  </p>
<p>EU 32 partners, 13 member states. 4.5mill E funding to define scope, cost of infrastructure.</p>
<p>Goals:<br />
Co-ordinated data resources, integration &amp; interoperability of data, links to data in other domains, open access to data, enhance euro competitiveness in bioscience, address need for increased funding and its co-ordination.</p>
<p>v young science. Funding streams for infrastructure not in place.</p>
<p>stakeholders: users, experimentalists (data provision), resource providers (core &#038; specialist), tool providers (bioinformaticians), funders &#8211; govt bodies, EMBL, EU charities, Industry.</p>
<p>Challenges prompting ELIXIR: data growth, global context, large and distributed userbase, preservation &amp; accessibility of data, impact on biosciences, growth of funding</p>
<p>Cost of maintaining data is insignificant compared to cost of data generation. Makes sense to fund.<br />
Integration increasingly important as academic, molecular type data is increasingly needed by medicine, agriculture etc</p>
<p>ESFRI: Biology research infrastructure proposals. ELIXIR will support these.</p>
<p>Reports from initial committee meetings (userbase consultation) due now &#8211; will define the scope &#038; remit of ELIXIR. Then work on international agreement for goals, costs then look at how to fund.</p>
<p>Can&#8217;t keep everything centralized &#8211; need more distribution. Hub at EBI and nodes in diff member states</p>
<p>Will provide: core and specialist data resources, compute centres, infrastructure for tools and services integration, support for Bio ESFRI projects, community support and training.</p>
<p>DB survey. 170 DBs across EU. Many of the core DBs are at the EBI, but are distributed in the sense that data providers are across Europe. Also many specialist resources across EU. All of these use the core resources as reference data. DB sizes follow power law &#8211; most &lt;10GB but a few are huge. All have web browser queries. Some still have email query. About 70% have data downloads and about 30% have programmatic access. 39/170 have some restrictions on data access (legal or practical). A fairly high proportion have no funding. Most of them cost &lt; mill euros. About 40 mill euros a year being spent at the moment on these DBs. Total invest to date is 308 mill euros. 90% have less than 3 year funding security. Most have less than 50K unique users/month, but a few have many more. Most have &lt;5 staff, a few have many. Many don&#8217;t have any members of staff. See Poster E41 for details.</p>
<p>So &#8211; ELIXIR needs to co-ordinate, prioritise and stabilise funding for these resources.</p>
<p>Databases relatively under control compared to other aspects: Standards and ontologies, Literature, Other domains (medical data, biodiversity data etc), Integration</p>
<p>Don&#8217;t need to centralise standards devel, is fine for them to come out of communities, but do need to encourage and publicise standards. OBO.</p>
<p>Lit: integrated, open access text-based lit resource would be nice <img src='http://www.cassj.co.uk/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>compute resources&#8230;? Other domains deal with much bigger scale data (CERN), but they have fewer users and bioinfo data is growing at an exponential rate. Can&#8217;t chuck NGS data around the web. So &#8211; what do we need to keep? Should it be centralised? Probably need biodata grid like CERN (only more complex).<br />
Modularise organisation of data resources. Build network of biocomp resources. Catalyse devel of web services and cloud computing. More program to data rather than other way round. Work with EU supercomputing centres.</p>
<p>User priorities: integration, format compatibility, website usability.<br />
Short term: access &#8211; programmatic, web-site, web service, downloads. Develop well-maintained catalogue.<br />
Long term &#8211; integration of data and tools. Encourage commercial tools to adopt open standards</p>
<p>Co-ord training.</p>
<p>Comments:</p>
<p>DB developers should have to abide by standards in order to publish / be funded by ELIXIR<br />
Global context is important and ELIXIR will take into account international collab models in funding approaches. Data sharing will be required.<br />
ELIXIR not about providing national infrastructure &#8211; this should come from per-country funding. Only interested in pan-EU infrastructure. Prob national nodes would be well set up to  also provide pan-EU function though (shared compute etc).</p>
<p>May call for proposals for nodes from EU countries, although no actual funding and no mechanism for deciding which would be accepted yet so proposals a bit hypothetical.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=343</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NestedMICA motif discovery</title>
		<link>http://www.cassj.co.uk/blog/?p=338</link>
		<comments>http://www.cassj.co.uk/blog/?p=338#comments</comments>
		<pubDate>Sun, 28 Jun 2009 13:12:45 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ISMB2009]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=338</guid>
		<description><![CDATA[Matias Piipari
http://www.sanger.ac.uk/software/nmica
scalable. 1000s sequences. 100s motifs highly parallel. extendable
Actually, looking at metamotifs and motif classification in this talk.
Different TF families have different modes of binding.
Can build a metamotif that encodes the distribution across a number of motifs in a family. Like a PWM, but with confidence intervals. Developed a nested sampler to learn these from [...]]]></description>
			<content:encoded><![CDATA[<p>Matias Piipari</p>
<p>http://www.sanger.ac.uk/software/nmica</p>
<p>scalable. 1000s sequences. 100s motifs. highly parallel. extendable</p>
<p>Actually, looking at metamotifs and motif classification in this talk.<br />
Different TF families have different modes of binding.</p>
<p>Can build a metamotif that encodes the distribution across a number of motifs in a family. Like a PWM, but with confidence intervals. Developed a nested sampler to learn these from a set of motifs (eg for bHLH TFs); the metamotif describes repeating patterns in that set. Can use these as a prior for motif finding, which improves detection of the real motif in a test set. (I think&#8230;)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=338</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Seeder</title>
		<link>http://www.cassj.co.uk/blog/?p=336</link>
		<comments>http://www.cassj.co.uk/blog/?p=336#comments</comments>
		<pubDate>Sun, 28 Jun 2009 12:53:40 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ISMB2009]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=336</guid>
		<description><![CDATA[Perl modules for cis-regulatory motif discovery. For designing synthetic promoters. Motifs built from promoter proximal sequences in plants. Didn&#8217;t pay attention to the actual algorithm, but see: http://seeder.agrenv.mcgill.ca. On CPAN.
]]></description>
			<content:encoded><![CDATA[<p>Perl modules for cis-regulatory motif discovery. For designing synthetic promoters. Motifs built from promoter proximal sequences in plants. Didn&#8217;t pay attention to the actual algorithm, but see: http://seeder.agrenv.mcgill.ca. On CPAN.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=336</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>bioHDF</title>
		<link>http://www.cassj.co.uk/blog/?p=334</link>
		<comments>http://www.cassj.co.uk/blog/?p=334#comments</comments>
		<pubDate>Sun, 28 Jun 2009 12:20:43 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ISMB2009]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=334</guid>
		<description><![CDATA[Spoke to a guy working on bioHDF (GeoSpiza) over lunch. Missed his talk yesterday. Alternative to SAMtools BAM format. HDF has been around for a long while, they&#8217;ve stuck a Bio layer on top of it. Plan is to biolib-ify it to get Bio(perl&#124;java&#124;python&#124;conductor) integration. Might be worth thinking about for the Gbrowse2 display [...]]]></description>
			<content:encoded><![CDATA[<p>Spoke to a guy working on <a href="http://www.geospiza.com/research/biohdf/index.html">bioHDF</a> (GeoSpiza) over lunch. Missed his talk yesterday. Alternative to SAMtools BAM format. HDF has been around for a long while, they&#8217;ve stuck a Bio layer on top of it. Plan is to biolib-ify it to get Bio(perl|java|python|conductor) integration. Might be worth thinking about for the Gbrowse2 display of short read data?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=334</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Debian Bioinformatics OSS stuff</title>
		<link>http://www.cassj.co.uk/blog/?p=328</link>
		<comments>http://www.cassj.co.uk/blog/?p=328#comments</comments>
		<pubDate>Sun, 28 Jun 2009 10:31:10 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ISMB2009]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=328</guid>
		<description><![CDATA[Lightning talk
Steffen M&#246;ller
Debian-Med
recent needs placed on distros: high performance computing, cloud/grid &#8211; bootable images with customisations. exact spec of software and env.
increased focus on: libraries &#8211; distributing software, data management &#8211; using software
debian-med.alioth.debian.org
medical and bioinfo packages. svn package maintenance. Open community
Extension of Deb, not a split.
All packages autobuild from source for > 10 arch
Packages auto-forwarded [...]]]></description>
			<content:encoded><![CDATA[<p>Lightning talk<br />
Steffen M&#246;ller<br />
Debian-Med</p>
<p>recent needs placed on distros: high performance computing, cloud/grid &#8211; bootable images with customisations. exact spec of software and env.<br />
increased focus on: libraries &#8211; distributing software, data management &#8211; using software</p>
<p><a href="http://debian-med.alioth.debian.org">debian-med.alioth.debian.org</a><br />
medical and bioinfo packages. svn package maintenance. Open community<br />
Extension of Deb, not a split.<br />
All packages autobuild from source for &gt; 10 arch<br />
Packages auto-forwarded to Ubuntu.<br />
&gt; 160 packages</p>
<p>current work: more communication of functionality.<br />
GSoC projects &#8211; improved cloud prep. Data processing script (?)</p>
<p>Have got an EC2 AMI, but not public yet.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=328</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>biolib Pjotr Prins</title>
		<link>http://www.cassj.co.uk/blog/?p=324</link>
		<comments>http://www.cassj.co.uk/blog/?p=324#comments</comments>
		<pubDate>Sun, 28 Jun 2009 10:23:40 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ISMB2009]]></category>

		<guid isPermaLink="false">http://www.cassj.co.uk/blog/?p=324</guid>
		<description><![CDATA[generic C/C++ libraries for Bio::* projects. esp for data IO/parsing/interpretation
In &#60;language of choice&#62; bind &#60;generic C implementation&#62; using SWIG with Biolib
Limited pool of bioinf programmers and being spread too thin across diff language specific projects. Shared stuff good cos less duplication, more people testing code etc. Inexplicable picture of young girl. Man with beard.
Bioperl on [...]]]></description>
			<content:encoded><![CDATA[<p>generic C/C++ libraries for Bio::* projects. esp for data IO/parsing/interpretation</p>
<p>In &lt;language of choice&gt; bind &lt;generic C implementation&gt; using SWIG with Biolib</p>
<p>Limited pool of bioinf programmers and being spread too thin across diff language specific projects. Shared stuff good cos less duplication, more people testing code etc. Inexplicable picture of young girl. Man with beard.</p>
<p>Bioperl on Github. Uses Cmake (as opposed to autoconf? &#8211; modular, resolves complex dependencies) and SWIG (rules for generating code, DRY, pattern matching, multi-language support (&gt;20))</p>
<p>Year One:<br />
AffyIO<br />
Staden sequencer trace files<br />
GSL (GNU Scientific Library)<br />
Rlib R stuff<br />
R/qtl quant genetics<br />
Libsequence sequence analysis<br />
Bio++ sequence analysis</p>
<p>Future:<br />
Automated API doc<br />
more libs (Emboss, NCBI)<br />
more languages<br />
Bio* integration (CPAN, Ruby gems etc)<br />
Distribute as packages</p>
<p>NB: BoF session at 16.50</p>
<p>Note to self: SAMtools?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cassj.co.uk/blog/?feed=rss2&amp;p=324</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
