SBOL Workshop

The Inn at Virginia Tech

Duck Pond Meeting Room – Upper level

Blacksburg, VA

January 7-10, 2011

Objectives of the Workshop


The points we would like to stabilize are:



  • development of a controlled vocabulary for the different categories
    of genetic parts (names and definition)


  • development/validation of graphical representations of these
    elements


  • specification of relations indicating how these elements can be
    combined in genetic constructs


  • specification of rules to annotate DNA sequences (how features of
    constructs are inherited from the annotation of the
    parts)


  • specification of a file format making it possible to exchange
    sequences between software applications.


  • agreement to support SBOL in future projects


The deliverables of the workshop will be:



  • a manuscript describing SBOL (may not be completely fleshed out but
    fairly close)


  • a clear avenue allowing our respective software to be SBOL
    compliant


  • a process to revise the SBOL specification as we go


  • a fundraising strategy to support the development of this
    community-building effort

Workshop Confirmed Attendees

  1. Michal Galdzicki (University of Washington)
  2. Deepak Chandran (University of Washington)
  3. Herbert Sauro (University of Washington)
  4. Cesar A. Rodriguez (BIOFAB)
  5. Guy-Bart Stan (Imperial College)
  6. Douglas Densmore (Boston University)
  7. Chris Myers (University of Utah)
  8. Barry Moore (University of Utah)
  9. Mandy Wilson (VBI)
  10. Matt Lux (VBI)
  11. Brad Howard (VBI)
  12. Julie Marchand (VBI)
  13. Laura Adam (VBI)
  14. Jean Peccoud (VBI)
  15. Nicholas Roehner (University of Utah)
  16. Akshay Maheshwari (UCSD)
  17. David Ball (VBI)

Agenda

Friday: Current State of SBOL

  • Continental Breakfast Buffet (8:00-9:00am)
  • Logistics (9:00-10:00am)
    • Introductions
    • Review the objectives of the workshop
    • Refine the agenda
  • Break (10:00-10:30am)
  • SBOL Tools and Concepts (10:30am-12:00pm)
    • SBPkb & libSBOL (10:30-11:15am)
      • Presenter: Michal
    • SBOL visual (11:15am-12:00pm)
      • Presenter: Cesar
      • Quick tour of software tools using SBOLv (in order of
        standard adoption)
        • Spectacles, TinkerCell, Device Editor/J5
  • Lunch – Preston’s Restaurant (12:00-1:00pm)
  • SBOL Use Cases (1:00-2:30pm)
    • BIOFAB Data Access Web Client and Service
      • Presenter: Cesar
    • BIOFAB UK-US “Bit-to-Atom-to-Bit” Exchange
      • Presenter: Guy
    • BioCompiler
      • Presenter: Doug
    • GenoCAD
      • Presenter: Mandy Wilson
    • TinkerCell
      • Presenter: Deepak
    • OpenSequenceAssembler
      • Presenter: Cesar
    • iBioSim
      • Presenter: Chris Myers
  • Break (2:30-3:00pm)
  • Group Discussion (3:00-5:00pm)
    • Finalize SBOL controlled vocabulary and graphical
      representations
  • Community engagement and outreach (evening)
    • Governance
    • Open and Free Licensing Policy
    • Software and data licensing policies
    • Community engagement and communication

Saturday: SBOL Semantics

  • Continental Breakfast Buffet (8:00-8:30am)
  • The Sequence Ontology (9:00-10:00am)
    • Presenter: Barry
  • Break (10:00-10:30am)
  • Knowledge representation frameworks: ontologies and grammars
    (10:30am-12:00pm)
    • Presenter: Jean
  • Lunch – Preston’s Restaurant (12:00-1:00pm)
  • Group discussion SBOL Semantics (1:00-2:35pm)
    • Relationships between SBOL sequence concepts
    • SBOL Semantics
    • SBOL Script/Grammar
  • Group discussion fundraising strategy (evening/afternoon)
    • RCN Grant
    • SI2 Grants
    • NIH Workshop grants
    • DOE grant

Sunday: Data Exchange and Community Engagement

  • Continental Breakfast Buffet (8:00-8:30am)
  • Legacy Formats (8:30-9:30am)
    • FASTA, GenBank Files: parts libraries, designs
    • CSV Files: A quick and easy means for encoding synthetic
      biological component performance
    • FCS Format
  • Future Directions (10:00am-12:00pm)
    • SBOL as a unified data exchange format
    • SBOL and SBML
  • Lunch – Preston’s Restaurant (12:00-1:00pm)
    • [Herbert leaves at 2pm; funding discussion must come before]
  • Group Discussion (1:00-5:00pm)
    • Finalize SBOL controlled vocabulary and graphical
      representations
    • SBOL-semantic as the proposed data exchange format

Monday: Funding SBOL development

  • Manuscript brainstorming, outlining, writing (8:30am-12:00pm)


Community Engagement

Action Items


1. Prepare governance document for SBOL wiki

    Herbert will draft a governance document from the SBML project

2. Talk to the BioBricks Foundation about trademarking SBOL (Doug)

    a) Copyright specifications
    b) Creative Commons of SBOL Visual
    c) Openness of SBOL Semantic, e.g. owner status
    d) Decide license for software libraries

3. Implement and test first draft of Level 1:

    a) Finish first draft of libSBOL, distribute to developers by the end of Feb 2011
        i) Export/Import XML (Michal)
        ii) Export/Import JSON (Cesar)
        iii) Export/Import GenBank (Deepak/Michal/Cesar)
        iv) Export/Import GFF3 (??)
    b) Write up initial draft SBOL spec document and upload it to the wiki by the end of Feb 2011
       (Formalize as Update to BBF RFC 16, 30, 31, 33, 68)
    c) Plan initial community paper on Level 1 SBOL
    d) Clotho will implement SBOL import/export by the end of April
    e) TinkerCell will implement SBOL import/export by the end of April
    f) GenoCAD will implement SBOL by the end of April
    g) Imperial College
    h) BIOFAB will implement SBOL export by the end of April

4. First week of May: meet via video conference call to assess progress and plan for the rest of May.

5. Finalize SBOL Level 1: tentatively May, before SB5.0

    a) Any revisions and feedback from Action 3 should inform the final release
    b) Write up final SBOL spec document, upload to the wiki, finalize papers

6. Officially begin process of developing Level 2 at SB5.0

7. Conferences

    a) IWBDA (all day, 5th of June) – Developer Meeting
    b) SB 5.0 (15th to 19th June) – Public Meeting (details to be decided)

8. Investigate SBOL award for iGEM sponsored by the Synthetic Biology Data Exchange Group


SBOL Questions:


1. What approach should be taken with optional or new fields that are
not part of the standard?

SBOL Specification Motions


SequenceFeature Class


1. “type” will remain the name of the field and will indicate the Sequence Ontology ID

    Motion Carried

2. Rename the top three fields to “displayId”, “name”, “description”. We require that “displayId” be alphanumeric/underscore and start with a letter or underscore. “name” will be a short description that will be familiar to a user, e.g. “Lac Operon”

    Motion Carried

3. Add a sequence field to SequenceFeature

    Motion Carried
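
The “displayId” rule in motion 2 can be checked mechanically. The following is a minimal sketch, assuming the rule means ASCII letters, digits and underscores only; the function name `is_valid_display_id` is hypothetical and not part of any agreed API.

```python
import re

# Assumed reading of the motion: ASCII letters, digits and underscores only,
# with a letter or underscore as the first character.
DISPLAY_ID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_display_id(display_id: str) -> bool:
    """Return True if display_id satisfies the displayId naming rule."""
    return DISPLAY_ID_PATTERN.fullmatch(display_id) is not None


if __name__ == "__main__":
    for candidate in ["lacZ_1", "_hidden", "1abc", "Lac Operon"]:
        print(candidate, is_valid_display_id(candidate))
```

Under this reading a human-friendly label such as “Lac Operon” is rejected as a displayId (it contains a space), which is why it belongs in the “name” field instead.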


SequenceAnnotation Class


1. SequenceAnnotation should include two fields, “start” and “strand”. “strand” indicates the orientation of the strand, either positive or negative.

    Motion Carried

2. Add a “stop” field to the SequenceAnnotation

    Motion Carried


Part Class


1. Combine “Part” and “PhysicalDNA” and rename to “DNAComponent” with attributes “displayId”, “name”, “description”, “isCircular” and “dnaSequence”.

    Motion Carried (Strong objection from Michal: he thinks that PhysicalDNA is a class that we need).


New Classes and Other Items


1. Provide the DNA sequence in its own class. There are two fields: “dnaSequence”, which stores a string representation of the DNA base-pair sequence, and a uri field, “DNARef”, which is a reference.

    Motion Carried

2. The two fields in the DNA Sequence should be mutually exclusive. This is a data restriction and will be included in a list of validation rules.

    Motion Carried

3. The DNA coordinate system will be one-based.

    Motion Carried

4. Start has to be a number >= 1. Stop >= Start, and Stop – Start + 1 must equal the DNA sequence length of the corresponding feature. This will be included in the list of validation rules.

    Motion Carried

5. The DNA sequence will be represented as follows:

    a. The DNA sequence will use the IUPAC ambiguity recommendation.
       (See http://www.genomatix.de/online_help/help/sequence_formats.html)
    b. Blank lines, spaces, or other symbols must not be included in the sequence text.
    c. The sequence text must be in ASCII or UTF-8 encoding. For the alphabets used, the two are identical.

    Motion Carried

6. The DNAComponent should have an additional field, called “type”. This field will represent the physical type of the DNA, for example, oligo, insert, plasmid.

    Motion Denied

7. A new class should be included in SBOL, called “library”. This class will have three fields, “displayId”, “name”, “description”. A library object can reference 0..n DNAComponent objects and 0..n SequenceFeature objects.

    Motion Carried
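
The validation rules from motions 2, 4 and 5 could be collected along the following lines. This is only a sketch: the class layout, the snake_case field names and both function names are assumptions made for illustration, lowercase letters are accepted as a convenience, and “U” is excluded because the text is DNA.

```python
from dataclasses import dataclass
from typing import List, Optional

# IUPAC nucleotide codes: the four DNA bases plus the ambiguity symbols.
IUPAC_DNA = set("ACGTRYSWKMBDHVN")


@dataclass
class DNASequence:
    dna_sequence: Optional[str] = None  # the "dnaSequence" text field
    dna_ref: Optional[str] = None       # the "DNARef" uri field


def validate_dna_sequence(seq: DNASequence) -> List[str]:
    """Return a list of rule violations (empty if the object is valid)."""
    errors = []
    if seq.dna_sequence is not None and seq.dna_ref is not None:
        errors.append("dnaSequence and DNARef are mutually exclusive")
    if seq.dna_sequence is not None:
        # Spaces, line breaks and any other symbol fall outside the alphabet.
        bad = sorted(set(seq.dna_sequence.upper()) - IUPAC_DNA)
        if bad:
            errors.append(f"symbols outside the IUPAC DNA alphabet: {bad}")
    return errors


def validate_coordinates(start: int, stop: int, feature_length: int) -> List[str]:
    """Check one-based, inclusive start/stop against the feature length."""
    errors = []
    if start < 1:
        errors.append("start must be >= 1 (coordinates are one-based)")
    if stop < start:
        errors.append("stop must be >= start")
    if stop - start + 1 != feature_length:
        errors.append("stop - start + 1 must equal the feature sequence length")
    return errors
```

Returning a list of messages rather than raising on the first failure lets a tool report every violated rule at once, which seems useful if these checks end up in a shared list of validation rules.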
 
   


        


Interim Level 1.5 Amendments


1. To add four additional optional uri references to the DNAComponent:

    a) uri to a model specification (Deepak, Chris, Herbert, Jean, Mandy)
    b) uri to experimental data, includes sample location (Doug, Cesar, Jean, Mandy)
    c) uri to physical assembly information (Doug, Cesar, Jean)
    d) uri to a visualization format, cf. TinkerCell, iBioSim, Clotho (Deepak, Herbert, Chris, Doug, Jean, Mandy)

    Communication via SBOL developers mailing group (To be created: Mike)

    Motion Carried


Level 2 Suggestions:


Create an author class and add a reference field to the DNAComponent
class.


Discussion of a license field and licensing issues is
required


Unresolved Motions


5. The “displayId” should be unique amongst all sequence
features.

Other Notes


The DNA sequence in a feature must match the region of the DNA component
sequence that it aligns to.
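
One way to read this note, combined with the one-based coordinates and the “strand” field agreed above, is sketched below. The function names are hypothetical, and treating a negative-strand feature as the reverse complement of the annotated region is an assumption; the notes do not say how strand affects the comparison.

```python
# Complement table for the IUPAC DNA alphabet (A<->T, C<->G, R<->Y, K<->M,
# B<->V, D<->H; S, W and N are their own complements).
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an IUPAC DNA string."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def feature_matches_component(component_seq: str, feature_seq: str,
                              start: int, stop: int, strand: str = "+") -> bool:
    """Check that feature_seq equals the component region start..stop.

    start and stop are one-based and inclusive, so the region is the
    Python slice [start - 1:stop].
    """
    if start < 1 or stop < start or stop > len(component_seq):
        return False
    region = component_seq[start - 1:stop].upper()
    if strand == "-":
        # Assumption: negative-strand features are stored 5'->3' on their
        # own strand, so compare against the reverse complement.
        region = reverse_complement(region)
    return region == feature_seq.upper()
```

For a component `AAACGTTT`, a positive-strand feature `ACG` annotated at 3..5 matches, while the same feature annotated at 4..6 does not.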


UML Diagram


Misc:


Governance


SBML:


meetings twice a year


Editors / Mailing list / Voting


Core data model and packages for different applications in SBML
v3.


-This has been the assumption so far


Funding needed for libSBOL like libSBML


libSBOL documentation needs to be strong to help adoption


Funding SBML


DARPA / Japan / DOE / NSF


Intellectual property: trademark? Nobody really knows how to
“protect” the freedom of a standard. Could BBF trademark
SBOL?


Icons, standards could be licensed with creative commons attribution
only.


Guy: How to introduce newcomers?


Jean: To form a larger group, we need an organized effort to hold
meetings, to grow the community and to be inclusive.


Hackathons and Forum meetings.


Doug:


Herbert: Finish SBOL Level 1 (Feb)


Deepak: We need concrete use cases


Cesar: How is this advantageous compared to GenBank?


Matt: Level 1 should exclude functional information


Chris Myers: Versions need to be enforced to call it a standard, so
that when you have something you know whether it is compliant or
not.


Herbert: Editors are the gatekeepers, they decide what is in and out of
the spec. They maintain the documentation.


Doug: SB 5.0 is a better venue for an SBOL workshop; infiltrate by any
means necessary


Chris: There needs to be a specification!


Doug: Adopt the SBML charter as a model, then move from
there


Kevin Clancy: Could presentations please be posted for the
meeting?


Funding


NSF ABI: Add a participant cost section to a research proposal to ABI
(everyone is involved)


University of Utah: Seed level funding (Karen/Barry/Chris)


NSF RCN: Deadline July 5 (Jean will take the lead). A copy of this
proposal will be sent to DOE to get funded through end-of-year
funds (see with Herbert)


NCI/NIH: workshop grant NIH –
http://grants.nih.gov/grants/guide/pa-files/PA-10-071.html (Herbert
will lead a SBOL Workshop and Doug will have IWBDA that will include a
SBOL)


NIH – NIH Small Research Grant Program –
http://grants.nih.gov/grants/guide/pa-files/PA-10-064.html


NSF SI2: Jean will lead a SynBio portal proposal with SBOL


Suggested SBOL API Functions


----- “get” functions -----


note: in the list below func (x, args) can also be restructured as
x.func (args)


// File with Libraries AND/OR DNAComponents AND/OR “whatever”
SBOLObject readFile(fileName)

SBOLObject readString(string)

List getLibraries(SBOLObject)    // SBOL object with 0 or more libraries

Library readLibraryFile(fileName)    // File with ONE library

Library readLibraryString(string)    // String with ONE library

List getAllDNAComponents(SBOLObject)

List getAllDNAComponents(Library)

List getAllSequenceFeatures(SBOLObject)

List getAllSequenceFeatures(Library)

List getOrderedSequenceFeatures(DNAComponent)    // Features sorted according to start position

List getOrderedSequenceAnnotations(DNAComponent)    // Annotations sorted according to start position

isA(SequenceFeature, string?)    // Checks if the given feature “is a” type given by the second arg


Example pseudo-code:


SBOLObject sbol = readFile("myfile.json")

List components = getAllDNAComponents(sbol)

for each (component c in components)

    print c.sequence
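

As an illustration of the ordering and “is a” helpers listed above, here is a minimal Python sketch. The classes are stand-ins invented for the example (libSBOL's actual types are not defined in these notes), and `is_a` only does an exact comparison of Sequence Ontology IDs; a fuller version would presumably also accept descendant terms.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SequenceFeature:
    display_id: str
    type: str  # Sequence Ontology ID, e.g. "SO:0000167" (promoter)


@dataclass
class SequenceAnnotation:
    start: int
    stop: int
    strand: str
    feature: SequenceFeature


@dataclass
class DNAComponent:
    display_id: str
    annotations: List[SequenceAnnotation] = field(default_factory=list)


def get_ordered_sequence_annotations(component: DNAComponent) -> List[SequenceAnnotation]:
    """Annotations sorted according to start position."""
    return sorted(component.annotations, key=lambda a: a.start)


def get_ordered_sequence_features(component: DNAComponent) -> List[SequenceFeature]:
    """Features in the order of their annotations' start positions."""
    return [a.feature for a in get_ordered_sequence_annotations(component)]


def is_a(feature: SequenceFeature, so_id: str) -> bool:
    """Simplified check: exact match on the Sequence Ontology ID."""
    return feature.type == so_id


if __name__ == "__main__":
    promoter = SequenceFeature("pLac", "SO:0000167")
    cds = SequenceFeature("gfp", "SO:0000316")
    device = DNAComponent("device1", [
        SequenceAnnotation(50, 120, "+", cds),
        SequenceAnnotation(1, 40, "+", promoter),
    ])
    for f in get_ordered_sequence_features(device):
        print(f.display_id, is_a(f, "SO:0000167"))
```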


----- “set”, “create” and “write” functions -----


note: in the list below func (x, args) can also be restructured as
x.func (args)


SBOLObject createSBOL()

writeXMLFile(SBOLObject, filename)

writeJSONFile(SBOLObject, filename)

writeFlatFile(SBOLObject, filename)    // Do you mean GenBank format?

DNASequence createDNASequence(string?)

DNAComponent createDNAComponent(SBOLObject)

setSequence(DNAComponent, DNASequence)

DNAAnnotation createDNAAnnotation(SBOLObject)

setSequenceFeature(DNAAnnotation, DNAFeature)

setStartPosition(DNAAnnotation, int start)    // Stop field is determined automatically


DNAFeature createSequenceFeature(SBOLObject)


setSequenceType(DNAFeature, string?)
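

The comment on setStartPosition says the stop field is determined automatically. Given the validation rule that Stop – Start + 1 must equal the feature's sequence length, one plausible way to do that is sketched below. The Python classes and snake_case names are hypothetical stand-ins, and requiring a feature to be attached before the start position is set is a design assumption, not something the notes specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DNAFeature:
    sequence: str


@dataclass
class DNAAnnotation:
    feature: Optional[DNAFeature] = None
    start: Optional[int] = None
    stop: Optional[int] = None


def set_sequence_feature(annotation: DNAAnnotation, feature: DNAFeature) -> None:
    """Attach a feature; refresh stop if a start position is already set."""
    annotation.feature = feature
    if annotation.start is not None:
        annotation.stop = annotation.start + len(feature.sequence) - 1


def set_start_position(annotation: DNAAnnotation, start: int) -> None:
    """Set the one-based start and derive stop from the feature length."""
    if start < 1:
        raise ValueError("start must be >= 1 (coordinates are one-based)")
    if annotation.feature is None:
        raise ValueError("attach a feature before setting the start position")
    annotation.start = start
    # Inclusive coordinates: stop - start + 1 == len(feature.sequence)
    annotation.stop = start + len(annotation.feature.sequence) - 1
```

With a four-base feature and a start of 10, the derived stop is 13, so the annotation always satisfies the length rule without the caller computing it.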