Saturday, June 25, 2011

Updates : Porting Neologism to Drupal 7 [3]

A lot has happened in the project since the last post. In fact, I am ready with my mid-term submission.

The three content types : vocabulary, class and property have been added to the port. The fields that have been added are :

Vocabulary :
  • Title
  • Namespace URI
  • Authors
  • Abstract
  • Body
  • Additional Custom RDF

Class :
  • Related vocabulary
  • Class URI
  • Label
  • Comment
  • Superclass
  • Disjoint with 
  • Details

Property :
  • Related Vocabulary
  • Property URI
  • Label
  • Comment
  • Details
  • Functional Property
  • Inverse Functional Property
  • Domain
  • Range
  • Superproperty
  • Inverse
The vocabulary, class and property content types are being registered correctly with Evoc.
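
To make the field lists above a little more concrete, here is a rough sketch of the three content types as plain data records. It is written in Python purely for illustration and is not the Drupal code in the port; all class, attribute and function names (Vocabulary, VocabClass, VocabProperty, terms_of) are hypothetical, and I am assuming that fields such as Superclass, Domain and Range can hold more than one value.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Vocabulary:
    title: str
    namespace_uri: str
    authors: List[str] = field(default_factory=list)
    abstract: str = ""
    body: str = ""
    additional_custom_rdf: str = ""


@dataclass
class VocabClass:
    vocabulary: Vocabulary  # the "Related vocabulary" reference
    class_uri: str
    label: str
    comment: str = ""
    superclasses: List[str] = field(default_factory=list)   # assumed multi-valued
    disjoint_with: List[str] = field(default_factory=list)  # assumed multi-valued
    details: str = ""


@dataclass
class VocabProperty:
    vocabulary: Vocabulary  # the "Related Vocabulary" reference
    property_uri: str
    label: str
    comment: str = ""
    details: str = ""
    functional: bool = False
    inverse_functional: bool = False
    domains: List[str] = field(default_factory=list)          # assumed multi-valued
    ranges: List[str] = field(default_factory=list)           # assumed multi-valued
    superproperties: List[str] = field(default_factory=list)  # assumed multi-valued
    inverse: Optional[str] = None


def terms_of(vocabulary, terms):
    """Return the classes/properties whose related vocabulary is `vocabulary`."""
    return [term for term in terms if term.vocabulary is vocabulary]
```

The point of the sketch is only that every class and property points back to exactly one vocabulary, which is what the Node Reference fields express in the port.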

Richard has listed the next steps here :
http://drupal.org/node/1196510




Thursday, May 26, 2011

Updates : Porting Neologism to Drupal 7 [2]

Coding period has started.

We decided to start by creating bundles for :
  • Vocabulary
  • Class
  • Property

The References module will be used to create the fields of type Node Reference.


Monday, May 16, 2011

Updates : Porting Neologism to Drupal 7

Wiki page of the Project : http://groups.drupal.org/node/145269

Project page on drupal.org : http://drupal.org/project/neologism

We will be using the issue queue on drupal.org for all discussions related to the project, so that the entire Drupal community can provide suggestions. This is the link to the Neologism issue queue :
http://drupal.org/project/issues/neologism?categories=All

Lin Clark gave a short introductory session over Skype on how to use the issue queue on drupal.org and Git, and pointed out various sites that could be useful during the project, such as http://drupal.stackexchange.com/

Richard created a branch for the project in the Neologism repository on drupal.org. We had two options: either clone the entire D6 code into the branch and transform it into a D7 module step by step, or start with an empty branch and add one feature after another. We chose the second option since it is cleaner and always gives us a working Drupal 7 module.

Also, I would be working on my new MacBook Pro for GSoC and since I had previously worked only on Windows while developing on Drupal, I contacted Guido, who is currently the main developer of Neologism, so he could help me with the development environment for the project.

I have a basic Drupal 7 website running on MAMP, and I will be using Komodo for development and the command line for Git.

GSoC 2011 Proposal for Drupal : Porting Neologism to Drupal 7

I support Drupal's vision to become the best CMS in projects related to Semantic web.

In view of the above initiative, Drupal 7 comes with a core RDF module. There is a contributed RDF module as well which allows us to extend the functionality of the core module through :
1. RDFx - Provides additional serialization formats.
2. RDF-UI - UI to specify RDF mappings.
3. Evoc - UI to import vocabularies and store them in the DB.



For different sites to have perfectly interoperable RDF, they should use the same RDF vocabulary. At present, no comprehensive vocabulary exists which provides predicates to suit each and every Semantic Web development project. For instance, FOAF may be a very good choice of RDF vocabulary when it comes to building social networking websites. However, if one wants to create a project which involves Learning Resource Objects, FOAF would not be able to provide predicates for all the Learning Object Metadata elements. That's why it's important to be able to create new vocabularies according to the specific requirements of the project.


As of now, there is no standard User interface for creating an entire RDF Vocabulary in Drupal 7. One has to write an RDF Schema in XML format and then register it with Drupal using Evoc.

My initial idea :

I initially wanted to extend the functionality of the contributed RDF module by adding a user interface to "create" and "register" customised RDF vocabularies. The module would provide an easy UI for creating the vocabularies in the frontend, generate the corresponding RDFS in the backend, and allow the user to register the vocabulary with Drupal.
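
To give a feel for what "generating the corresponding RDFS" could look like, here is a minimal sketch in Python (not the Drupal/PHP code the module itself would use). The function name build_rdfs, the dictionary keys and the example.org namespace are all made up for illustration; only the RDF and RDFS namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

# Use the conventional prefixes when the tree is serialised.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("rdfs", RDFS_NS)


def _q(ns, name):
    """Build a namespace-qualified name in ElementTree's {uri}local form."""
    return "{%s}%s" % (ns, name)


def build_rdfs(namespace_uri, classes, properties):
    """Return an RDF/XML string describing the given classes and properties.

    `classes` and `properties` are lists of dicts; the keys used here
    ("name", "label", "domain", "range") are a hypothetical input format.
    """
    root = ET.Element(_q(RDF_NS, "RDF"))
    for cls in classes:
        node = ET.SubElement(root, _q(RDFS_NS, "Class"),
                             {_q(RDF_NS, "about"): namespace_uri + cls["name"]})
        ET.SubElement(node, _q(RDFS_NS, "label")).text = cls["label"]
    for prop in properties:
        node = ET.SubElement(root, _q(RDF_NS, "Property"),
                             {_q(RDF_NS, "about"): namespace_uri + prop["name"]})
        ET.SubElement(node, _q(RDFS_NS, "label")).text = prop["label"]
        for key in ("domain", "range"):
            if key in prop:
                ET.SubElement(node, _q(RDFS_NS, key),
                              {_q(RDF_NS, "resource"): namespace_uri + prop[key]})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_rdfs(
        "http://example.org/lom#",  # hypothetical namespace
        classes=[{"name": "Course", "label": "Course"}],
        properties=[{"name": "teaches", "label": "teaches",
                     "domain": "Teacher", "range": "Course"}],
    ))
```

The user would only ever fill in the names, labels, domains and ranges through a form; the XML itself stays hidden behind the UI.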

Why I felt the Need for this extension :
Since Drupal has a relatively steep learning curve, we must try to make things as easy as possible for newcomers so that more and more people can enthusiastically join Drupal's Semantic initiative and start using Drupal for their Semantic Web projects.

Using the User Interface that I planned to develop, someone with even a little knowledge of writing an RDF Schema would be able to create and register his own vocabulary.

Change of Plans :

I thus went ahead and posted my proposal on GSoC-11 Drupal Group.

As you can see here (http://groups.drupal.org/node/136969), the discussions turned out to be very fruitful indeed. Lin Clark advised me to have a look at the ongoing Neologism project (http://neologism.deri.ie/), which provides a free and open-source vocabulary publishing platform. It turned out to be functionally very similar to what I had planned for my project. However, Neologism is not yet available as a module for Drupal 7. Lin also advised me to get in touch with Richard Cyganiak (http://richard.cyganiak.de/), who was actively working on Neologism development.

Neologism is a powerful codebase for publishing customised vocabularies that is already in considerable use in the RDF community, but using it in existing Drupal sites is difficult since there is no dedicated D7 module. Moreover, the code is hosted on Google Code. To confuse matters further, there is a very old version of the Neologism module on drupal.org, which was not updated as the project progressed on Google Code. It also has several dependencies, a few of which are not even easily available on the internet since the previously existing links are now broken. So there was scope for collaboration.

I contacted Richard and we discussed my project proposal over several emails. Richard informed me that he would soon be working to port Neologism to D7. I offered to do it as a part of my GSoC project. I felt it was better to contribute to the Neologism module than to create another module from scratch which would overlap in functionality with the upcoming Neologism module. Moreover, Neologism has many good features, like a vocabulary overview diagram and a time-tested user interface, which make it even more sensible to port it to D7. Richard liked the idea of pooling our resources to work for a common cause. He also agreed to mentor the project.


Finally, we came up with the following abstract for the project :

  1. Porting Neologism to D7
  2. Migrating the Neologism code-base and documentation from Google Code to drupal.org
  3. Updating the documentation and informing existing users about the change.
  4. Testing that the Neologism module works well in existing D7 sites
I intend to carry forward the work that has already been put into creating the Neologism vocabulary publishing platform by porting it to D7 and making it available to the huge Drupal community and any existing Drupal sites that want to use RDF with custom vocabularies.

At this moment, I believe that the Evoc module in D7 provides all the features that we need to successfully create the Neologism module.



Timeline for the Proposed Project:


April 25 - May 23 (Before official coding period starts) [Information Learning Curve and Background readings]
  • Familiarise myself with the current Neologism codebase and Drupal RDF modules.
  • Go through the current documentation of the Neologism project.
  • Discuss the implementation plans and risks with the mentors.
  • Familiarise myself with the coding standards and development practices followed while creating Drupal modules.
  • Get used to working with the Drupal repositories, since code migration from Google Code to the Drupal repositories would also be a part of the SoC project.
   
May 23 - 29 (First week) [Familiarizing]

  • Fix some bugs/implement simple features for the current Neologism platform to familiarise myself further with the codebase.
  • Create a document for general reference which describes how the module would appear at the end of the Summer of Code. Documentation at this stage would not go into the technical details but only describe how the module would appear to the end user at the end of the project.

May 30 - June 5 (1 week) [DB Migration]
Neologism is currently running on D6. There are a lot of differences between the Evoc module in D6 and D7. Thus, we need to change the DB Schema of Neologism to match the D7 Version of Evoc.

This marks the End of Phase-1.
At this moment, we are ready to start porting Neologism to D7.


June 6 - July 24 (7 weeks)[Porting Neologism to D7]
This is the major task of the project. This task has been further divided into sub-tasks as follows :
Week 1 : Port the menu system and vocabulary list to D7
Week 2 : Port the vocabulary overview page to D7
Week 3 : Port the RDF output to D7
Week 4 : Provide the feature of importing and loading vocabulary by using the evoc module
Week 5 : Port the vocabulary creation/edit form to D7
Week 6 : Port the class/property creation/edit forms to D7
Week 7 : Port content negotiation and caching to D7

Also, during this period, I would need to carry out integration testing for the module.

This marks the end of Phase 2.
At this stage, we have a functional D7 port of Neologism module.

July 25 - July 31 (1 week) [Documentation Migration/Upgrading and Migrating the code to Drupal Repository]
The tasks planned for this phase are as follows :
  • Set up Drupal.org infrastructure for neologism module
  • Coordinate with documentation team to move existing documentation to drupal.org   
  • Update documentation wherever needed
  • Notify existing users of the changes

August 1 - August 7 (1 week) [Test the module on existing Drupal sites]
We would need to evaluate how the Neologism module works if installed into existing D7 sites and identify any issues. Currently Neologism is built as an installation profile which installs an entire site that provides just a vocabulary editor. There might be some initialization which was previously done during the installation procedure which would now need to be done when the Neologism module is installed into existing sites. We need to make sure there are no issues faced when the module is installed or reinstalled into existing D7 sites.


August 8 - August 14 (1 week) [Buffer period]

Buffer for general Neologism bugfixing/improvements as identified throughout the project

August 15 - GSoC Ends.
End of Phase 3.


Deliverables :

  • A functional D7 port of Neologism, which installs on existing sites without any major issues.
  • Updated Documentation of Neologism.
  • Documentation of the status of the module when GSoC completes and the plan of action for the future.
  • List of known issues in the module.


Link to Discussion created on Drupal Groups


I had already planned my idea well before GSoC. Thus, I was quick to draft my proposal initially on the Drupal GSoC-11 Group. You may find the discussion here : http://groups.drupal.org/node/136969

I also asked the members of the Semantic Web Group in Drupal-Groups to provide me feedback on my proposal. http://groups.drupal.org/node/137274


On the Drupal IRC channels (drupalcommerce and drupal-contribute), I got the opportunity to discuss my idea with a few people who provided me with useful bits of information and guidance.

Mentors:

I tried to contact the people who have been actively involved with the development of RDF and related modules in Drupal 7. Lin Clark (linclark) (http://lin-clark.com/) suggested I get in touch with Richard Cyganiak (cygri) (http://richard.cyganiak.de/) to mentor my project, since it relates to the Neologism project (http://neologism.deri.ie/), which he started and has been working on. I contacted Richard and he generously agreed to mentor me on my GSoC project.

Lin Clark (linclark) has offered to help me during the first few weeks to learn the customs of using Drupal.org issue queue and creating clean patches.

Guido Cecilio (guidocecilio), who is the current main developer of Neologism, will also be available to answer questions regarding the Neologism code and coordinate his work with me.

Stephane Corlosquet (scor) has agreed to help by answering questions regarding D6 to D7 migration.

Thus, I have the overwhelming support of the Drupal community to assist me during the course of my project.

Sunday, April 17, 2011

Project Plan for BIOMOD-2011

I am a part of the team DA-NanoTrons, representing my college DA-IICT in BIOMOD-2011.

Our team is -
  • Faculty mentors
    • Manish K. Gupta [DA-IICT, Gandhinagar]
    • Taslimarif Saiyed [NCBS, Bangalore]

  • Team members 
    • Avinash Parida
    • Denny George
    • Mayank Kandpal 

Project plan :

Part 1 : (BIOMOD-2011)
Providing an interface for users to input equations corresponding to 2-D shapes, which will generate a caDNAno-friendly .json file as output. This can be opened directly in caDNAno and the structure can be edited further there. (So it's like a basic caDNAno template creator whose output can then be made into more complex structures in caDNAno.)

The application will be a standalone for now and might be integrated into caDNAno later on.
We can even provide some default templates for some very basic equations.

Plan of Action :
1.1 Understand the format of caDNAno-generated .json files and try to create simple files which are displayed correctly in caDNAno. This would be done by creating some simple structures in caDNAno and studying the files it saves. Initially, don't worry about 3D; just create 2D .json file structures and open them in caDNAno.

1.2 Hack through the cadnano ActionScript code-base to understand their auto-stapling algorithm.

1.3 What would the program do :
    1.3.1 Take equation as input
    1.3.2 Generate the outline of the corresponding 2D shape
    1.3.3 Generate a single loop which fills the entire structure  
    1.3.3 Divide the loop into 7000 parts (there is a reason behind 7000)
    1.3.4 Select a point to break the loop and thus create a single long scaffold.
    1.3.5 Assign each division a base-pair (ACGT) ordered in the sequence of the standard M13mp18 virus DNA sequence. I have a rough visualization of the expected output after this stage, which I would share soon.
    1.3.6 Use the auto stapling algo to generate staples in the structure. (optional)
    1.3.7 Automatic StapleError correction feature (optional, will skip this most probably in Phase-1)
    1.3.8 Create the cadnano friendly json file corresponding to the structure and staples we generated.
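
Part of the steps above (dividing the loop, breaking it into a scaffold and assigning bases) can be sketched roughly as follows. This is only an illustration under several assumptions: the outline is given as a closed parametric curve (a unit circle here), points are sampled evenly in the parameter rather than in arc length, the space-filling loop is not attempted at all, and the four-letter sequence is a placeholder, not the real M13mp18 sequence. All function names are hypothetical.

```python
import math
from itertools import cycle, islice

NUM_DIVISIONS = 7000  # number of parts from the plan above

# Placeholder only: this is NOT the M13mp18 sequence.
PLACEHOLDER_SEQUENCE = "ACGT"


def sample_loop(curve, n):
    """Return n points on a closed parametric curve, t in [0, 2*pi).

    Sampling is uniform in t, which is uniform in arc length only for
    special curves such as a circle.
    """
    return [curve(2 * math.pi * i / n) for i in range(n)]


def break_loop(points, break_index):
    """Cut the closed loop at break_index to get one open scaffold path."""
    return points[break_index:] + points[:break_index]


def assign_bases(path, sequence):
    """Pair each division with a base, reading `sequence` in order.

    The sequence is repeated if it is shorter than the path, which is
    only needed because the placeholder above is so short.
    """
    return list(zip(path, islice(cycle(sequence), len(path))))


def circle(t):
    """Stand-in for the user's equation: a unit circle."""
    return (math.cos(t), math.sin(t))


loop = sample_loop(circle, NUM_DIVISIONS)
scaffold = break_loop(loop, 100)  # arbitrary break point
labelled = assign_bases(scaffold, PLACEHOLDER_SEQUENCE)
```

Each entry of labelled is a (point, base) pair, which is roughly the intermediate structure that the later stapling and .json export steps would have to consume.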

How we could divide the work :
To work in parallel on different things, we all need to be clear on how we would be storing the structure at each stage in the backend. In other words, what the output format/structure of each stage would look like.
For example, if we are clear from the start about what the backend would look like in step 1.3.6, then one person can start directly from a manually created output of step 1.3.6 and work on how to create a caDNAno-friendly json from the structure that we finally come up with.
Thus, before we begin any coding, we need to be clear about what output we expect at each stage. For this, we first need to decide which platform we would be working on. Considering the requirements, I am assuming Java (or Python) would be the best choice.
In case we find other platforms with better library support we would use that.  So first step is to hunt down the available libraries for each task.
Parallel task 1 : 1.3.1 - 1.3.5
Parallel task 2 : 1.3.6
Parallel task 3 : 1.3.7

FallBack Plan for part 1:
If we are unable to understand the auto-stapling algorithm, we can simply skip step 1.3.6 and jump to 1.3.7, i.e. just generate the single long scaffold corresponding to the 2D structure represented by the equation and convert it to a .json file which can be opened in caDNAno. The user can then use the auto-stapling feature within caDNAno.


Part 2 : (to be done in the next year’s Biomod, OR if time permits(unlikely), within the current Biomod timeline)

2.1 Provide support for equations of 3D structures.

2.2 Either create a Views interface so that the user doesn't need to switch to caDNAno just to check the output, OR port the entire application as a caDNAno plugin itself.
