My PhD is spread across four departments at two Universities. Since everyone involved in research tends to be busy at different times it can be hard to keep the people I work with up to date with my progress. It's even harder to meet with people who I don't yet work with but who may have answers to questions I'm working on or even questions I haven't yet thought of.
This website is my first attempt at explaining the work that I do at the same time as documenting some of the progress I've made and the challenges I still face. I hope that it may lead to something that will help more people find it easier to join in with my research.
This is my first ever website, these are the first videos I've ever made and this is the first PhD I've ever done. I hope that I can improve all the above in time and I'd appreciate any feedback you can give.
Frameworks
A framework for creating metabolic networks is a rigid definition system. It allows us to be sure that we are accurately and reproducibly defining reactions, compounds and other network parameters in our model.
In this video I try to define what a framework is, what properties it requires and what mine consists of.
My aim is to look at the functioning of metabolic networks. Before we can start to look at how they function we need to have a way of defining the reactions that make them up. This tightly defined way of describing reactions is what I call a reaction framework and I’ll give an example here of what my chosen framework does.
Let's start with a single reaction, here the phosphorylation of beta-D-glucose by ATP. The first detail we need to add to this picture is the quantity of each compound involved in the reaction. We call these quantities the stoichiometries of the molecules in the reaction. In this case one of each compound is used.
Next we need to define a unique code for each compound. This is important so that computational models and humans don't get confused by multiple naming systems. As an example of why this is so important, consider that in English alone, ATP can be called ATP, Adenosine Triphosphate, Adenosine 5'-Triphosphate or even adenosine 5'-(tetrahydrogen triphosphate). We need to know that when we talk about any of these things, in any language, we are referring to the same compound. Within my framework, ATP is referred to as C00002, but as long as a framework is ordered and complete we could call this anything.
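The idea of collapsing many names onto one code can be sketched in a few lines of Python. This is only an illustration and not the software described here: the `SYNONYMS` table and the `resolve_compound` function are hypothetical names, and the table holds just a handful of ATP names.

```python
# Hypothetical synonym table: every known name maps to exactly one compound code.
SYNONYMS = {
    "atp": "C00002",
    "adenosine triphosphate": "C00002",
    "adenosine 5'-triphosphate": "C00002",
    "adenosine 5'-(tetrahydrogen triphosphate)": "C00002",
}


def resolve_compound(name: str) -> str:
    """Return the unique compound code for any known name (case-insensitive)."""
    key = " ".join(name.lower().split())  # normalise case and repeated spaces
    try:
        return SYNONYMS[key]
    except KeyError:
        raise KeyError(f"unknown compound name: {name!r}") from None


print(resolve_compound("Adenosine  Triphosphate"))  # C00002
```

Whatever name goes in, only the code comes out, so a model built on the codes never depends on which naming system was used.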
Our next step is to assign a unique code to the reaction shown. In my case I also link a single preferred name to this reaction code so that it's easier to keep track of reactions in larger networks.
In biological networks, reactions are almost always mediated by an enzyme. The exact enzyme and the gene coding for it depend on the organism and are not suitable for inclusion in a framework. Enzyme Commission (EC) numbers are a way of grouping enzymes together according to similar function. For this reaction, enzymes with EC numbers 2.7.1.1 and 2.7.1.2 are known mediators.
Metabolic networks consist of hundreds or even thousands of reactions. We know however that core metabolism is extremely well conserved across species and that many common groups of metabolic reactions are found in the majority of organisms. Where these groups of reactions perform well defined tasks we call them pathways. Examples of pathways are Phospholipid Biosynthesis and the Citrate Cycle. A framework can simplify the study of metabolic networks by defining reactions within pathways. This is particularly useful as we are often interested in the small differences between a particular pathway across species. In this example the reaction number R01600 is almost always found within the pathway called Glycolysis with the Pathway code Rn00010.
We can see now how a simple reaction actually requires a large amount of definition so let's look again at what a framework provides.
I've shown how a framework lets us define Stoichiometries, Compounds, Reactions, Enzymes and Pathways. The framework that I have designed has two further definitions that can be used: Compartments, which tell us where in the cell the reaction takes place, and Kinetics, which tell us how the reaction proceeds.
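To make the list of definitions concrete, here is a minimal Python sketch of a single reaction record carrying all seven components. The `Reaction` class and its field names are hypothetical, the preferred name is a placeholder, and the compound codes other than C00002 are my reading of the KEGG entries and should be checked against KEGG before use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reaction:
    """One reaction record holding every component the framework defines."""
    reaction_id: str                      # unique reaction code
    name: str                             # single preferred name
    reactants: dict                       # compound code -> stoichiometry
    products: dict                        # compound code -> stoichiometry
    ec_numbers: tuple = ()                # enzyme classes known to mediate it
    pathways: tuple = ()                  # pathway codes it belongs to
    compartment: Optional[str] = None     # extra detail: where it happens
    kinetics: Optional[str] = None        # extra detail: how it proceeds


r01600 = Reaction(
    reaction_id="R01600",
    name="beta-D-glucose phosphorylation",   # placeholder preferred name
    reactants={"C00002": 1, "C00221": 1},    # ATP + beta-D-glucose (codes assumed)
    products={"C00008": 1, "C01172": 1},     # ADP + beta-D-glucose 6-phosphate (codes assumed)
    ec_numbers=("2.7.1.1", "2.7.1.2"),
    pathways=("Rn00010",),
)
print(r01600.reaction_id, r01600.ec_numbers)
```

Compartment and kinetics are left empty here because they belong to a particular model rather than to the reaction itself.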
Now that we've defined our Framework, we're ready to use it.
Let's return to the single reaction we started with. This single reaction needs to be joined to many others to create the top half of the Glycolysis Pathway. Remember, a framework is just the basis of defining this larger network. In order to build the network we must first build tools to construct it and then design a file system to store it. I'll show you how I've done that in the next video.
Tools for building networks on top of a framework I : The Reaction Creator
Once a solid framework is in place, we need tools to build and edit networks of reactions. The Reaction Creator is a piece of software I've written to do this.
Once a solid framework is in place, we need tools to build and edit networks of reactions. In my experience, existing software for this task is either so poor as to be useless or costs far too much and includes too many restrictions to be usable in research. Because of these difficulties I have had to create my own software based on a framework derived from the Kyoto Encyclopedia of Genes and Genomes, KEGG.
This framework means that all the stoichiometries, compounds, reactions, enzymes and pathways in the KEGG database are easily available to the user. We can see that this fulfills the main requirements of a framework that I set out earlier.
The most important, although not the most obvious, task of any network creation tool is the ability to define new reactions not defined within the framework. Remember earlier that I said that my framework contained two extra details not necessary to a normal reaction framework. These were compartments and kinetics. The framework already contains all possible reactions within a given compartment but does not contain the reactions that move compounds from one compartment to another.
To illustrate what I mean by transport reactions, let's return to the network of the top half of glycolysis we've seen before. Since we know that this reaction occurs within an organism we can define the organism as a compartment and visualise it as I show here.
In order for this set of reactions to begin, the compound at the top of the pathway, D-Glucose, must be present within the cell. Since the organism must import glucose from outside its membrane we need to define a reaction that transports Glucose across the membrane.
These transport reactions are critical to the functioning of any model and must be defined by the user. KEGG does not consider compartments and this is why I needed to add this new detail to my framework.
Let's now look at the software I use to create new reactions. Before we do, I should note that this application was fully developed, including learning all the programming principles behind it, in around 2 months. As such it's not yet in a state where it is easily usable by anyone not familiar with it. Nevertheless it is a powerful tool and I'll try to show you just some of the things that I can do with it.
The first thing we're going to do is load up the application.
One of the things to notice is that this application contains all of the components of the framework, such as compounds and reactions, that I described earlier. The applications actually run in Microsoft Access, which is just an easy to use database to store the framework in. Each of the components of the framework is held in its own table within the database and linked as required; for example, groups of stoichiometries and compounds make up reactions and groups of reactions make up pathways.
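The linked-table layout can be sketched with Python's built-in `sqlite3` module standing in for Microsoft Access. The table and column names below are assumptions made for illustration and are not the actual schema of the applications; the reaction name is a placeholder.

```python
import sqlite3

# In-memory database standing in for the Access file (illustrative schema only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE compound (compound_id TEXT PRIMARY KEY, preferred_name TEXT NOT NULL);
CREATE TABLE reaction (reaction_id TEXT PRIMARY KEY, preferred_name TEXT NOT NULL);
-- Stoichiometry rows link compounds to reactions.
CREATE TABLE stoichiometry (
    reaction_id TEXT NOT NULL REFERENCES reaction(reaction_id),
    compound_id TEXT NOT NULL REFERENCES compound(compound_id),
    role        TEXT NOT NULL CHECK (role IN ('reactant', 'product')),
    coefficient INTEGER NOT NULL,
    PRIMARY KEY (reaction_id, compound_id, role)
);
CREATE TABLE pathway (pathway_id TEXT PRIMARY KEY, name TEXT NOT NULL);
-- Membership rows link reactions to pathways.
CREATE TABLE pathway_reaction (
    pathway_id  TEXT NOT NULL REFERENCES pathway(pathway_id),
    reaction_id TEXT NOT NULL REFERENCES reaction(reaction_id),
    PRIMARY KEY (pathway_id, reaction_id)
);
""")

conn.execute("INSERT INTO compound VALUES ('C00002', 'ATP')")
conn.execute("INSERT INTO reaction VALUES ('R01600', 'placeholder name')")
conn.execute("INSERT INTO stoichiometry VALUES ('R01600', 'C00002', 'reactant', 1)")
conn.execute("INSERT INTO pathway VALUES ('Rn00010', 'Glycolysis')")
conn.execute("INSERT INTO pathway_reaction VALUES ('Rn00010', 'R01600')")

# Walk the links: pathway -> reactions -> stoichiometries -> compounds.
rows = conn.execute("""
    SELECT p.name, s.reaction_id, c.preferred_name, s.role, s.coefficient
    FROM pathway p
    JOIN pathway_reaction pr ON pr.pathway_id = p.pathway_id
    JOIN stoichiometry s     ON s.reaction_id = pr.reaction_id
    JOIN compound c          ON c.compound_id = s.compound_id
""").fetchall()
print(rows)  # [('Glycolysis', 'R01600', 'ATP', 'reactant', 1)]
```

Keeping stoichiometries in their own table means each compound and each reaction is stored exactly once, however many reactions or pathways refer to it.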
The software usually runs at a higher resolution than in this video, so here I'm making sure that the majority of the interface is visible.
Here we see the basic interface of the reaction creator. At the top left is an area where we can search for compounds by any of their common names. Remember what I said earlier about the many names for ATP: our aim has to be to remove these complexities and ambiguities. If the user searches for a compound name that is correct but not preferred, the software will still be able to find it, but a user can only insert the preferred name into their model. If the user knows the compound code already, they can quickly search for this too.
I'll do a search for glucose and you can see that there are a lot of compounds that contain the phrase glucose in their names. We can easily look through that list until we find what we want. At all times we can see the corresponding KEGG compound ID, that's the unique compound ID within our framework.
Here's another example of searching for compounds, but this time with oxygen. This shows us how important it is to have a rigid framework in place so that similar compounds are not confused. Just look at the number of compounds we could choose that at first could seem identical.
In this quick example what I'm actually going to do is define a transport reaction that moves ATP across a membrane.
First of all we need to search for ATP in the compound picker area. We can see we have a lot of options, but the basic ATP, C00002, is the one we want. We want this compound to start as a reactant in the extracellular space. By clicking on the ‘add new record’ button we add this compound to the list of reactants. The software now warns the user that they are creating a reaction with no products, violating the law of mass conservation. We now need to add another molecule of ATP as a product but this time in the Cytosol. By default, the stoichiometry of all reactants and products is one but this could be changed by the user at this stage. In transport reactions this can be invaluable, for example the Sodium-Potassium-Chloride transporter moves two Chloride ions across the membrane for each sodium and potassium ion.
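The kind of check behind that warning can be sketched as follows. This is a simplified illustration, not the check the software actually performs: `transport_imbalance` is a hypothetical function that only compares whole compounds on each side while ignoring compartments, which is enough for pure transport reactions but not for reactions that change the compounds themselves.

```python
from collections import Counter


def transport_imbalance(reactants, products):
    """Return the net amount per compound (products minus reactants).

    Each side is a list of (compound, compartment, stoichiometry) tuples.
    Compartments are ignored, so an empty result means every compound that
    goes in on one side of the membrane comes out on the other.
    """
    net = Counter()
    for compound, _compartment, amount in products:
        net[compound] += amount
    for compound, _compartment, amount in reactants:
        net[compound] -= amount
    return {compound: n for compound, n in net.items() if n != 0}


atp_outside = [("C00002", "extracellular", 1)]
atp_inside = [("C00002", "cytosol", 1)]

print(transport_imbalance(atp_outside, []))          # {'C00002': -1}
print(transport_imbalance(atp_outside, atp_inside))  # {}

# Sodium-Potassium-Chloride transporter: two chloride ions per sodium and potassium ion.
nkcc_out = [("Na+", "extracellular", 1), ("K+", "extracellular", 1), ("Cl-", "extracellular", 2)]
nkcc_in = [("Na+", "cytosol", 1), ("K+", "cytosol", 1), ("Cl-", "cytosol", 2)]
print(transport_imbalance(nkcc_out, nkcc_in))        # {}
```

The first call reproduces the situation in the video, where a reactant has been added but no product yet, and the non-empty result is what would trigger the warning.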
Looking up at the top right of the reaction creator is the similar reactions section. This stops the user from creating the same custom reaction more than once. Within any framework it is extremely important that each object, be it a compound or a reaction, is defined only once. Here we see that this transport reaction of ATP from the extracellular space to the cytosol has already been defined and we shouldn't do it again.
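One way such a duplicate check could work is to reduce every reaction to an order-independent signature and look that signature up before creating anything new. The sketch below is an assumption about the approach, not a description of the software: `reaction_signature`, `find_duplicate` and the reaction ID `ATP_extoc` are hypothetical, and it treats the reverse direction as a different reaction.

```python
def reaction_signature(reactants, products):
    """Order-independent key built from (compound, compartment, stoichiometry) tuples."""
    return (frozenset(reactants), frozenset(products))


# Reactions already in the framework, keyed by signature (hypothetical ID).
existing = {
    reaction_signature(
        [("C00002", "extracellular", 1)],
        [("C00002", "cytosol", 1)],
    ): "ATP_extoc",
}


def find_duplicate(reactants, products, known=existing):
    """Return the ID of an identical existing reaction, or None if it is new."""
    return known.get(reaction_signature(reactants, products))


print(find_duplicate([("C00002", "extracellular", 1)], [("C00002", "cytosol", 1)]))  # ATP_extoc
print(find_duplicate([("C00002", "cytosol", 1)], [("C00002", "extracellular", 1)]))  # None
```

Because the signature uses sets, the order in which the user adds reactants and products does not affect the result.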
If the reaction had not already been defined we would now need to type in a name and a unique reaction ID into the reaction naming section. By clicking the ‘Create the Reaction’ button the newly defined reaction is added to the framework and can now be used in all metabolic models.
We've now seen how to add custom reactions, particularly transport reactions, to the framework. In the next video we'll look at how we can combine reactions together to form networks.
Tools for building networks on top of a framework II : The Reaction Picker
With a solid framework and a way of adding extra reactions to the framework we are ready to combine reactions together to form networks. The Reaction Picker is software that I've written to build metabolic networks on top of a framework.
We've previously explored the framework of well defined reactions that lets us build models and we've just seen how we can create new reactions and add them to this framework. Now I'll show you how we can combine reactions together to form metabolic networks.
The first thing to do is to open up the reaction picker application.
This tool lets us build networks using reactions stored within the framework. That means reactions that were either imported from KEGG originally or that have since been created with the reaction creator.
Let's first look at the user interface. At the top left we have the reaction filter section which will let us filter the list of the reactions in the framework that we see just below it in the reaction chooser section. A wide variety of filters can be applied here.
Just below the reaction filter section is the reaction chooser which contains a long list of reactions. At the top of this list are reactions that we've recently defined using the reaction creator, towards the bottom are those that were imported from KEGG. In total there are around 9000 fully defined reactions but the number will grow as more transport reactions are defined. Returning to the top of the list we can see the ATP transfer reaction that we defined in the previous video.
Let's now return to the reaction filter section and look for the reaction that we looked at first of all. Reaction R01600 is the phosphorylation reaction we looked at, the conversion of beta-D-glucose to beta-D-glucose 6-phosphate, with ATP being converted to ADP in the process.
Since we know that the reaction is R01600 we type this in, press filter and then click on the reaction in the reaction chooser. Clicking on a reaction in the reaction chooser shows details of it in the reaction viewer. We can see the reaction ID, the preferred name, the list of reactants, and the list of products with the stoichiometries and the preferred compound names also shown with them.
Of course we don't always know the KEGG reaction ID and so the reaction picker application actually fills a very important role by letting us search for the reactions we want to put into our metabolic model. We're going to look again for R01600 but this time we'll pretend we don't know the reaction ID. We're going to type in the reactant name, we know that we want beta-d-glucose. By pressing filter we can see that there are lots of reactions that have beta-d-glucose as a reactant. One option now would be to click through all the listed reactions and inspect them individually until we are happy with our choice. We know however that we want the product of our reaction to contain the string 6-phosphate. By filtering with this parameter we now have 5 reactions to choose from.
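The two-step filtering can be sketched as a case-insensitive substring search over reactant and product names. This is an illustration only: `filter_reactions` is a hypothetical function, and apart from R01600 the entries in the toy list carry invented IDs (`DEMO1`, `DEMO2`) and exist only to give the filter something to narrow down.

```python
# Toy reaction list; only R01600 is a real reaction ID, the others are invented.
REACTIONS = [
    {"id": "R01600",
     "reactants": ["ATP", "beta-D-Glucose"],
     "products": ["ADP", "beta-D-Glucose 6-phosphate"]},
    {"id": "DEMO1",
     "reactants": ["beta-D-Glucose"],
     "products": ["alpha-D-Glucose"]},
    {"id": "DEMO2",
     "reactants": ["beta-D-Glucose 6-phosphate"],
     "products": ["beta-D-Fructose 6-phosphate"]},
]


def filter_reactions(reactions, reactant=None, product=None):
    """Keep reactions whose reactant/product names contain the given strings."""
    def matches(names, text):
        # A filter that is not set (None) matches everything.
        return text is None or any(text.lower() in name.lower() for name in names)

    return [r for r in reactions
            if matches(r["reactants"], reactant) and matches(r["products"], product)]


step1 = filter_reactions(REACTIONS, reactant="beta-d-glucose")
step2 = filter_reactions(REACTIONS, reactant="beta-d-glucose", product="6-phosphate")
print([r["id"] for r in step1])  # ['R01600', 'DEMO1', 'DEMO2']
print([r["id"] for r in step2])  # ['R01600', 'DEMO2']
```

Even after both filters more than one candidate remains, which is why the remaining reactions still have to be inspected individually before choosing one.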
Just like with the reaction creator, we can see why a rigid framework is so important to model building. The 5 reactions shown in the reaction chooser are very similar and yet the differences must be well defined so we don't choose the wrong one. In this case we want R01600 and the next thing I'm going to do is add it to our metabolic model.
Before I show you how to add this reaction to the model I want to show you the very simple metabolic network I'm constructing and explain the reactions that make it up.
This diagram shows the reactions that make up the top half of glycolysis enclosed within an imaginary cellular membrane. By using the filters in the reaction picker we can find the reaction IDs for all the reactions shown.
The next thing that we need to do is allow the compounds at the edge of the network to move in and out of the cell we've defined. Using the reaction creator these transport reactions can be defined and given reaction IDs. Note that compounds in different compartments have a suffix added to them to denote which compartment they're in. This is all taken care of by my applications.
For the purposes of analysing this network we need to add a biomass or an objective reaction. This reaction can be thought of as the goal of the network, in this case the production of beta-D-fructose 1,6-bisphosphate. This reaction converts the final product of the metabolic network into ADP, which can then be exported from the compartment by the ADP_ctoex reaction. This stops beta-D-fructose 1,6-bisphosphate accumulating within the cell and can be thought of as simulated growth.
Next we need to define boundary compounds and reactions. These are mathematical tricks that mean the cell has an infinite supply of alpha glucose, beta glucose and ATP and an infinite drain of ADP. They could be thought of as simulated ingestion and excretion.
Finally we must give names to the two compartments we have in our diagram. In this case they are the cytosol and the extracellular space.
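Why these boundary reactions are needed can be seen in a small steady-state check, where no compound is allowed to build up or run out. The network below is a deliberately reduced toy version that stops after R01600 and uses a plain drain as the objective; the compound labels, the `supply_*` and `drain_adp` reaction names, and the helper functions are all hypothetical.

```python
# Each reaction maps compound -> coefficient (negative = consumed, positive = produced).
REACTIONS = {
    "supply_glc": {"glc_ex": +1},                      # boundary: infinite glucose supply
    "glc_extoc":  {"glc_ex": -1, "glc_c": +1},         # transport into the cytosol
    "supply_atp": {"atp_ex": +1},                      # boundary: infinite ATP supply
    "atp_extoc":  {"atp_ex": -1, "atp_c": +1},
    "R01600":     {"glc_c": -1, "atp_c": -1, "g6p_c": +1, "adp_c": +1},
    "objective":  {"g6p_c": -1},                       # simplified objective: plain drain
    "ADP_ctoex":  {"adp_c": -1, "adp_ex": +1},         # export of ADP
    "drain_adp":  {"adp_ex": -1},                      # boundary: infinite ADP drain
}


def net_production(reactions, fluxes):
    """Net rate of change of each compound for the given reaction fluxes."""
    net = {}
    for reaction_id, coefficients in reactions.items():
        flux = fluxes.get(reaction_id, 0.0)
        for compound, coefficient in coefficients.items():
            net[compound] = net.get(compound, 0.0) + coefficient * flux
    return net


def is_steady_state(reactions, fluxes, tol=1e-9):
    """True when no compound accumulates or is depleted."""
    return all(abs(rate) <= tol for rate in net_production(reactions, fluxes).values())


all_on = {reaction_id: 1.0 for reaction_id in REACTIONS}
no_boundary = {r: 1.0 for r in REACTIONS if not r.startswith(("supply_", "drain_"))}

print(is_steady_state(REACTIONS, all_on))       # True
print(is_steady_state(REACTIONS, no_boundary))  # False
```

With the boundary reactions switched off, extracellular glucose and ATP are consumed without being replaced and extracellular ADP piles up, so the network cannot carry a steady flux.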
Let's just take a moment to look at everything we're going to need to define in our metabolic model. If you haven't followed all the parts of this diagram don't worry, I'll go through them again later. I hope it's clear though that reaction networks are not simple to define and require discipline to work with.
Let's now return to the reaction picker programme and see how this model looks within the programme.
On the right hand side in the selected reactions section we see that I've already created the model I've just shown, with the exception of R01600 which we're going to add now. Because this reaction occurs within the cytosol in this model, we choose cytosol as the compartment. Like almost all reactions in nature this one is reversible and so we choose the reversible 500 kinetic property. The 500 part of this choice is to do with the maximum flux that can be passed through this reaction when using Flux Balance Analysis on the network and I will try to explain that more fully in later videos on the analysis of networks.
Before we add the new reaction to our network it's worth looking at all the other reactions that have been defined. Note that the transfer reactions are defined as occurring within the transfer compartment. This is just a way of letting the application know that transfer reactions move compounds from one compartment to another.
Looking over at the selected reactions area there are three defined properties for each reaction. The single KEGG reaction ID fully defines the compounds, enzymes, stoichiometries, pathway and reaction by looking up these details from the framework. The compartment and kinetics properties are not included in the KEGG definition system and are therefore additional components in my framework. The three columns in the selected reactions area fully define the reaction's place within the model. The model itself is just the combination of all these reactions.
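The three-column view of a model can be sketched as a list of small records. Everything here is illustrative: `ModelEntry` and `flux_bounds` are hypothetical names, the kinetics chosen for `ADP_ctoex` is assumed, and reading "reversible 500" as flux limits of -500 to +500 is my assumption about what the label means, since the text only says the number relates to the maximum flux.

```python
from typing import NamedTuple


class ModelEntry(NamedTuple):
    """One row of the selected reactions area."""
    reaction_id: str   # looks up compounds, stoichiometries, enzymes and pathway
    compartment: str   # additional component, not part of KEGG
    kinetics: str      # additional component, not part of KEGG


def flux_bounds(kinetics: str):
    """Turn a label such as 'reversible 500' into assumed (lower, upper) flux limits."""
    direction, limit = kinetics.split()
    vmax = float(limit)
    if direction == "reversible":
        return (-vmax, vmax)
    if direction == "irreversible":   # assumed counterpart label
        return (0.0, vmax)
    raise ValueError(f"unrecognised kinetics label: {kinetics!r}")


# The model itself is just the combination of its entries.
model = [
    ModelEntry("R01600", "cytosol", "reversible 500"),
    ModelEntry("ADP_ctoex", "transfer", "reversible 500"),  # kinetics assumed
]

for entry in model:
    print(entry.reaction_id, entry.compartment, flux_bounds(entry.kinetics))
```

Nothing else needs to be stored per reaction, because every other detail is recovered from the framework through the reaction ID.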
Let's continue and add the reaction. The program inserts reaction R01600, with the chosen compartment and kinetic properties, into the metabolic model in the selected reactions area.
At this stage if we want to delete the newly added reaction, or any other reaction, we can just select it and press delete.
Now that I'm happy with my metabolic network model we need to export it for visualisation and analysis in other programs. My application exports all metabolic models in the Systems Biology Markup Language, SBML, which we'll look at in the next video.
Coming Soon !
If you find these videos useful please do get in contact with me. My contact details are available on my profile page which is linked to at the top of this page.