We have mapped a protein interaction network of human homologs of proteins that modify longevity in invertebrate species. This network is derived from a proteome-scale human protein interaction Core Network generated through unbiased high-throughput yeast two-hybrid searches. The longevity network is composed of 175 human homologs of proteins known to confer increased longevity through loss of function in yeast, nematode, or fly, and 2,163 additional human proteins that interact with these homologs. Overall, the network consists of 3,271 binary interactions among 2,338 unique proteins. A comparison of the average node degree of the human longevity homologs with random sets of proteins in the Core Network indicates that human homologs of longevity proteins are highly connected hubs with a mean node degree of 18.8 partners. Shortest path length analysis shows that proteins in this network are significantly more connected than would be expected by chance. To examine the relationship of this network to human aging phenotypes, we compared the genes encoding longevity network proteins to genes known to be changed transcriptionally during aging in human muscle. In the case of both the longevity protein homologs and their interactors, we observed enrichments for differentially expressed genes in the network. To determine whether homologs of human longevity interacting proteins can modulate life span in invertebrates, homologs of 18 human FRAP1 interacting proteins showing significant changes in human aging muscle were tested for effects on nematode life span using RNAi. Of 18 genes tested, 33% extended life span when knocked-down in Caenorhabditis elegans. These observations indicate that a broad class of longevity genes identified in invertebrate models of aging have relevance to human aging. They also indicate that the longevity protein interaction network presented here is enriched for novel conserved longevity proteins.
Studies of longevity in model organisms such as baker's yeast, roundworm, and fruit fly have clearly demonstrated that a diverse array of genetic mutations can result in increased life span. In fact, large-scale genetic screens have identified hundreds of genes that, when mutated, knocked down, or deleted, will significantly enhance longevity in these organisms. Despite great progress in understanding genetic and genomic determinants of life span in model organisms, the general relevance of invertebrate longevity genes to human aging and longevity has yet to be fully established. In this study, we show that human homologs of invertebrate longevity genes change in their expression levels during aging in human tissue. We also show that human genes encoding proteins that interact with human longevity homolog proteins change in expression during human aging. These observations taken together indicate that the broad patterns underlying genetic control of life span in invertebrates are highly relevant to human aging and longevity. We also present a collection of novel candidate genes and proteins that may influence human life span.
Citation: Bell R, Hubbard A, Chettier R, Chen D, Miller JP, et al. (2009) A Human Protein Interaction Network Shows Conservation of Aging Processes between Human and Invertebrate Species. PLoS Genet 5(3): e1000414. doi:10.1371/journal.pgen.1000414
Editor: Stuart K. Kim, Stanford University Medical Center, United States of America
Received: September 25, 2008; Accepted: February 10, 2009; Published: March 13, 2009
Copyright: © 2009 Bell et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by an NIH Nathan Shock Center grant (P30AG25708). PK is supported by NIH grant RL1AG032113. REH is supported by NIH grant RL1GM084432. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Genetic modulation of life span is ultimately mediated through proteins, and the mechanisms that allow this control must necessarily involve the interaction of multiple proteins. As a biological pathway, aging is a pleiotropic process, and many of the proteins identified as influencing this process have a proportionate pleiotropy of function. Modulating the level of a single protein has been found to provide robust increases in life span for an organism, but contributions from many genes are expected to dictate longevity in all organisms. This idea is supported by an investigation of yeast protein-protein interaction networks that found that proteins related to aging have a significantly higher connectivity than expected by chance. Similarly, a second group found that their computational model suggested aging genes have more connections in interaction networks, and that this may be useful in identifying new aging genes. Therefore, a useful way to identify novel genes with roles that affect life span is to identify their gene products' interactions with known aging-associated proteins.
A role for protein interactions in processes is most apparent at the level of protein complexes that assemble to carry out a particular function. Likewise, protein interactions that mediate signaling cascades demonstrate how interactions functionally translate into a biological pathway. Indeed, biological processes are built of hierarchical protein-protein interaction assemblies that together carry out the overall physiological process. Therefore, the identification of interactions that a protein participates in can be an informative way to pursue an understanding of the protein's function. A common method for identifying protein interactions is the yeast two-hybrid system (Y2H), which uses the interaction of two proteins to reconstitute a transcription factor that then activates expression of a reporter gene. An important development in the Y2H approach was the introduction of the screening of libraries of potential interacting proteins. This development made it possible to identify novel protein interactions. Novel interactions impart a suggested role in a physiological process for proteins based on the established involvement of their interaction partner in that process.
Recently, high-throughput approaches have expanded this idea to a systems-scale level: investigators can identify the network of interactions that occur among a large set of proteins, and from this infer the relationships of those proteins in, as well as their contribution to, the system. Such an approach has been used to interrogate the protein interaction networks that underlie model organisms, human cells, and organisms responsible for infectious diseases. Biological processes such as vulval development in nematodes and familial neurodegenerative diseases have also been the subject of large-scale Y2H interaction mapping. From these studies, many hypotheses for new participants in biological pathways have emerged.
The results from high-throughput protein interaction studies are known to contain false-positive (i.e. biologically irrelevant) interactions intermingled with the biologically relevant interactions. Independent large-scale studies of the same system may not necessarily distinguish the two, although detection of an interaction in more than one study is strong evidence for the authenticity of the interaction. An additional approach to address interaction validity is to use features of the network itself to provide evidence for the physiological relevance of the identified interactions. Protein interaction networks behave as scale-free networks, and the resultant properties such as path length and clustering features can be mined with bioinformatic methods to evaluate the properties of a given interaction within the network. Comparisons with other phenotypic data can provide further support. An observation of similar regulation using gene expression analysis has been used to establish confidence in protein interactions by a number of groups. Shared gene ontology annotations can also be used to identify characteristics of proteins that support the link(s) suggested by the interactions.
Interaction Network of Human Homologs of Invertebrate Longevity Proteins
We performed a comprehensive survey of the published literature on the genetics of aging as studied in model systems (yeast, fly, nematode and mouse) and identified 363 genes that have been reported to increase life span when mutated. Most of these genes were curated in the SAGE KE Genes/Interventions Database (http://sageke.sciencemag.org). The remainder were culled from published large-scale genetic screens for longevity phenotypes. In order to characterize these longevity genes/proteins in the context of a human protein interaction network, we sought to analyze their protein interaction partners in a large human protein interaction database. We have used high-throughput yeast two-hybrid methods to construct a large network composed of 114,689 unique binary interactions between fragments of human proteins. This network was generated using results from ~345,000 individual yeast two-hybrid screens. Aspects of the Prolexys human protein interaction network and methods used to generate it have been described previously. The 114,689-interaction network was filtered to create a Core Network with 70,358 unique binary interactions between protein fragments representing 10,425 unique genes curated as NCBI RefSeq entries. The Core Network was generated by removal of “sticky” proteins identified using a K-means clustering method. Exclusion of bait proteins with >87 interactions and prey proteins with >231 interactions resulted in removal of 44,331 interactions and 855 nodes (i.e. unique genes) from the unfiltered network. Figure 1A shows a log-log graph of node degree distributions of the unfiltered network (black circles) and the Core Network (red circles). The fact that the degree distribution appears as a straight line on a logarithmic plot indicates that the Core Network is scale-free. This Core Network was queried to determine the interaction properties of human proteins homologous to proteins experimentally implicated in regulation of life span.
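The filtering step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the input layout (a list of `(bait, prey)` pairs) and the function names `filter_sticky` and `degree_distribution` are hypothetical, the cut-offs of 87 and 231 are taken as given rather than re-derived by K-means clustering, and degrees are counted once on the unfiltered network before any interaction is dropped.

```python
from collections import Counter

def filter_sticky(interactions, max_bait_degree=87, max_prey_degree=231):
    """Drop interactions whose bait or prey exceeds the degree cut-offs.

    interactions: iterable of (bait, prey) pairs (hypothetical layout).
    Degrees are counted on the unfiltered network (an assumption).
    """
    pairs = set(interactions)  # keep unique binary interactions only
    bait_degree = Counter(bait for bait, _ in pairs)
    prey_degree = Counter(prey for _, prey in pairs)
    return {
        (bait, prey)
        for bait, prey in pairs
        if bait_degree[bait] <= max_bait_degree
        and prey_degree[prey] <= max_prey_degree
    }

def degree_distribution(interactions):
    """Return {degree: number of nodes}, ignoring bait/prey roles."""
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return Counter(len(partners) for partners in neighbours.values())

# Toy data with lowered cut-offs so that the bait "HUB" counts as sticky.
toy = [("HUB", "P%d" % i) for i in range(4)] + [("A", "B"), ("B", "C")]
core = filter_sticky(toy, max_bait_degree=3, max_prey_degree=3)
print(sorted(core))
print(degree_distribution(core))
```

Plotting the logarithm of each degree against the logarithm of its node count from `degree_distribution` would give the kind of log-log view shown in Figure 1A.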
A masked version of the complete Core Network is shown in Table S6.
Figure 1. Node Degree Distributions in Core Network and Longevity Network.
Panel A shows the node degree distribution in unfiltered and Core protein interaction networks. A log-log plot of node degrees in both unfiltered and Core interaction networks appears as a straight line, indicating that both are scale-free. Black circles represent node degrees of 11,280 proteins in a network of 114,689 interactions. Red circles show the node degree distribution after removal of bait proteins with >87 interactions and prey proteins with >231 interactions. The Core Network contains 70,358 binary interactions among 10,425 unique proteins. Panel B shows the node degree distributions of the Core and longevity networks represented as box plots. The average node degree in the Core Network is 13.5. The average node degree for the 175 longevity proteins is 18.8. Median node degrees (indicated by thick horizontal lines) for the core and longevity networks are respectively 5.0 and 7.0. doi:10.1371/journal.pgen.1000414.g001
The majority of genes and proteins identified as having a role in modulation of life span were discovered in yeast, fly and nematode. We therefore identified the human orthologs and homologs of these invertebrate longevity genes according to definitions used in NCBI's Homologene (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?DB=homologene) and the Karolinska Institute's Inparanoid Database (http://inparanoid.cgb.ki.se/). Of the 363 invertebrate longevity genes identified, 252 have human homologs and 175 of these homologs are represented in the Core Network of the Prolexys protein interaction database (Table S1). The proteins encoded by the 175 human homologs of invertebrate longevity genes were observed to interact with 2,163 additional human proteins in the yeast two-hybrid assays. This longevity protein interaction network ultimately consists of a total of 3,271 binary connections between the 2,338 proteins (Table S2). When the longevity network was derived from the Core Network, it was immediately apparent that the longevity homologs were unusually highly connected, with an average node degree of 18.8 and a median node degree of 7.0 (see Table S1 for individual node degrees). These values are notably higher than the average and median node degrees of 13.5 and 5.0 observed for the entire Core Network (Table S6). Figure 1B shows a box plot comparing the distribution of node degrees for the 175 human longevity protein homologs and the Core Network from which the longevity sub-graph was derived. This indicates that human homologs of longevity proteins comprise a group of highly connected hubs in the Core Network. The increase in the median node degree for the longevity proteins indicates that this distribution is not due to the effect of outliers.
A path length analysis was performed to determine whether the longevity protein homologs were more closely connected to each other than would be expected by chance. Figure 2A shows that the average mean shortest path length in 1,000 sets of 175 proteins selected at random from the Core Network is 4.61, as compared to 4.15 for the longevity network (p = 0.004). This result is consistent with the prediction that proteins with shared functions (in this case the modification of life span) are more likely to be closely associated in the network than would be expected by chance. To determine whether this path length difference is a trivial result of the high average node degree in the longevity network, we performed a path length analysis using networks with randomized connections. In order to do this, the edges in the Core Network of 70,358 binary interactions were randomly reassigned while preserving the node degree of each individual protein. The average path length between the longevity protein homologs present in 100 randomized core interactomes was then determined. As shown in Figure 2B, we found that the average shortest distance between any two longevity proteins (4.15) is significantly less than the average distance of 4.73 (±0.13) between these proteins in the 100 networks with randomly assigned interactions (p<0.01). This result shows that the decreased path length observed in the longevity network is not simply a feature of high node degrees but is in fact dependent on the connections between the specific interacting proteins included in the longevity network.
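Both path length comparisons above can be sketched with the standard library alone. The text does not say how edges were "randomly reassigned", so the double-edge swap used here is an assumption, chosen because it preserves every node's degree. All function names are hypothetical, edges are assumed to be unique with no self-interactions, and unreachable pairs are simply skipped when averaging.

```python
import random
from collections import deque

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def bfs_distances(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def mean_shortest_path(adj, nodes):
    """Mean shortest path length over all reachable pairs among `nodes`."""
    nodes = list(nodes)
    total, count = 0, 0
    for i, src in enumerate(nodes):
        dist = bfs_distances(adj, src)
        for dst in nodes[i + 1:]:
            if dst in dist:
                total += dist[dst]
                count += 1
    return total / count if count else float("nan")

def rewire_preserving_degree(edges, n_swaps, rng):
    """Double-edge swaps (a,b),(c,d) -> (a,d),(c,b); node degrees are unchanged."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-interaction
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # swap would duplicate an existing edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges

def empirical_p(observed, null_values):
    """Fraction of null statistics that are at or below the observed one."""
    return sum(v <= observed for v in null_values) / len(null_values)

# Toy usage: a 10-node ring, rewired while keeping every degree at 2.
ring = [(i, (i + 1) % 10) for i in range(10)]
shuffled = rewire_preserving_degree(ring, 20, random.Random(0))
print(mean_shortest_path(build_adjacency(shuffled), range(10)))
```

For the random-set comparison, `mean_shortest_path` would be evaluated on many random samples of 175 nodes; for the randomized-network comparison, it would be evaluated on the same 175 nodes in each rewired network, with `empirical_p` summarizing either null distribution.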
Figure 2. Path length analysis of longevity genes in Core Network.
Panel A shows a comparison of the mean shortest path length of the 175 genes in the longevity cohort to the average shortest path length distribution in the Core Network. The histogram shows the distribution of mean shortest path lengths observed in 1,000 sets of 175 genes randomly selected from the 10,430 unique genes present as nodes in the Core Network. The mean shortest path length for all genes is 4.61. By comparison, the mean shortest path length for the 175 longevity genes is 4.15 (vertical red line). The p-value for the significance of this difference is 0.004. Panel B shows path length analysis for interactions among longevity homologs using randomized networks. The mean shortest path length between the 175 longevity protein homologs in the network is 4.15 (vertical red line). The distribution of mean shortest path lengths between these proteins in 100 networks with randomly assigned connections is shown. The peak of the distribution in the randomized networks is 4.73. As none of the values from the permutation distribution was less than 4.15, the p-value for the significance of this difference is <0.01. doi:10.1371/journal.pgen.1000414.g002
Expression of Genes Encoding Human Longevity Homologs and Their Interacting Proteins during Human Aging
The 2,163 human proteins that interact with the invertebrate longevity homologs are not known to be involved directly in aging or longevity phenotypes. However, because of their ability to bind directly to known longevity proteins in the yeast two-hybrid assay, these can be considered as candidate longevity proteins. To validate potential roles for the interacting proteins in human longevity, we looked for evidence that the expression of genes encoding these proteins might be changed during the aging process. To do this, we compared the network to DNA microarray datasets comparing gene expression in human skeletal muscle from cohorts of young and old healthy volunteers. In this microarray study, skeletal (vastus lateralis) muscle biopsies from healthy older and younger adult men and women were compared using gene expression profiling. After quantile normalization, the number of genes significantly differentially expressed with age was determined by performing, on a probe-by-probe basis, 24,354 two-sample t-tests. To control the family-wise error rate (FWER), the significant genes were chosen at 5% using Holm's step-down method. FWER was used to ensure a low probability of any false positives among this list. Using a false discovery rate cut-off of 5%, a large number of genes were found to be differentially expressed as a consequence of age.
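As a reminder of how Holm's step-down method works, here is a minimal sketch. The function name `holm_reject` and the toy p-values are illustrative only and are not taken from the microarray study.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm's step-down procedure; returns True where the null is rejected.

    The k-th smallest p-value (k = 0, 1, ...) is compared with
    alpha / (m - k), and testing stops at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger p-value is retained as well
    return reject

# Toy p-values for four probes.
print(holm_reject([0.01, 0.04, 0.03, 0.005]))  # [True, False, False, True]
```

With 24,354 tests at the 5% level, the smallest p-value would have to fall below 0.05/24,354, roughly 2.1×10^-6, for even one probe to be declared significant under this procedure.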
To integrate the longevity interactome with the gene expression data, we asked whether any of the genes encoding longevity proteins or their interactors (“1° interactors”) were significantly changed in the transcript profiles from old vs. young human cohorts. Of the 175 longevity proteins, 169 were represented on the microarray used in this study by 210 probes. We determined how many of the 210 probes had a significant association of expression and age using analyses based on loess normalized intensities converted to log scale. HOPACH (Hierarchical Ordered Partitioning and Collapsing Hybrid) was then used to cluster the resulting genes and generate plots of similarly expressed genes. This analysis identified 54 of the 210 probes (52 of 169 unique genes) as being differentially expressed between the old and young cohorts (FDR q-value<0.05). The differentially expressed aging gene homologs are listed in Table 1.
Table 1. Human Longevity Gene Homologs with Significant Expression Changes in Muscle. doi:10.1371/journal.pgen.1000414.t001
To see whether this was unusual, we included an additional test to determine whether this set of probes is more enriched in genes associated with age than one would expect by chance. We drew randomly from the original list of probes (24,354 probes), 210 at a time, and for each of these random draws examined the number of probes significantly associated with age at the same level of significance. In only 1 of the 1,000 random draws we performed did that many or more significant probes come up, implying a significant enrichment among this set (p-value = 0.001; see Figure 3B). A permutation test for all 236 longevity gene homologs present on the microarray (represented by 291 probes) is shown in Figure 3A. Among the longevity gene homologs (regardless of whether they were present in the interaction network data), 66 out of 291 probes were significantly associated with age. In only 4 of the 1,000 random draws we performed did that many significant probes come up, implying a significant enrichment among this set as well (p = 0.004).
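The random-draw test described above amounts to an empirical p-value, which can be sketched as follows. The function name `draw_enrichment_p`, the boolean-list input and the toy array are assumptions for illustration; the real per-probe significance calls are not reproduced here.

```python
import random

def draw_enrichment_p(is_significant, set_size, observed_hits,
                      n_draws=1000, seed=0):
    """Share of random probe sets with at least `observed_hits` significant probes.

    is_significant: list of booleans, one per probe on the array.
    """
    rng = random.Random(seed)
    indices = range(len(is_significant))
    as_extreme = 0
    for _ in range(n_draws):
        sample = rng.sample(indices, set_size)  # draw without replacement
        hits = sum(is_significant[i] for i in sample)
        if hits >= observed_hits:
            as_extreme += 1
    return as_extreme / n_draws

# Toy array: 100 probes, the first 10 flagged as significant.
flags = [True] * 10 + [False] * 90
print(draw_enrichment_p(flags, set_size=10, observed_hits=3))
```

Applied to the study's design, `is_significant` would hold one flag for each of the 24,354 probes, `set_size` would be 210 and `observed_hits` would be 54; a result of 1 draw in 1,000 corresponds to the reported p-value of 0.001.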
Figure 3. Significance of gene expression changes for longevity gene homologs and interacting proteins.
The permutation distributions (based on 1,000 permutations of the array label) for the number of significant probes (based on FDR value in the association of age versus expression) for three different sets: A. human homologs of aging genes (based on 1,000 random draws of 291 probes), B. longevity gene homologs present in the interaction network (based on 1,000 random draws of 210 probes), and C. 1° interactor protein genes (based on 1,000 random draws of 2,507 probes). Vertical red lines indicate values (number of probes with FDR-based q-value<0.05) for the original experimental datasets from which the p-values of these three tests are derived. doi:10.1371/journal.pgen.1000414.g003
We next evaluated the 2,507 probes that correspond to genes encoding 2,036 of the 2,163 1° interactors in the longevity interactome network. We repeated the analyses described above for the longevity proteins. Among the 1° interactors, 611 out of the 2,507 probes (581 of 2,036 genes) were significantly associated with age. In 1,000 random draws of 2,507 probes, none contained 611 (or more) significant genes, demonstrating significant enrichment among the set of 1° interactors (p<0.001; see Figure 3C).
These statistical analyses clearly demonstrate that genes encoding human homologs of invertebrate longevity genes and genes encoding their interacting proteins are highly enriched among genes with a statistically significant change in expression between young and old muscle tissue in humans. This result is somewhat surprising in that these genes are derived primarily from experiments done in invertebrate models, and thus one might not expect a priori to see age-dependent changes in expression levels in human tissue. Two preliminary conclusions are suggested by these observations: 1) longevity genes discovered in invertebrate models are likely to play some roles in human longevity and 2) cells and tissues appear to modulate expression levels of such longevity genes during the aging process in humans. A list of human homologs of invertebrate aging genes and genes encoding interacting proteins that show significant expression changes in aging human muscle is shown in Table S3.
Figure 4 shows a subnetwork of the longevity interactome. This subnetwork includes only those genes whose expression is significantly changed in the aging microarray data. This subnetwork contains 339 interactions among 325 proteins, roughly 10% of the interactions in the larger network. We consider proteins in this network to be of high interest for further studies. An example of one group of interest is FRAP1 (mTOR) and its interacting proteins. FRAP1 has a total of 63 protein interactions in the longevity network.
Figure 4. Subset of Longevity Network including only those genes whose expression is significantly changed in young vs old human muscle.
Longevity gene homologs are shown in red; interacting proteins are shown in green. The network contains 339 interactions among 325 proteins. doi:10.1371/journal.pgen.1000414.g004
FRAP1 has a well-established role in longevity, with loss-of-function mutations in the FRAP1 orthologs in both nematodes and yeast leading to increased life span. Our results suggest that FRAP1 may also have a role in human longevity. Human FRAP1 interacts with 63 proteins that have not previously been shown to be involved in longevity. Some of these have functions that are consistent with known functions of FRAP1, e.g. an interaction with RPS27, a component of the small ribosomal subunit, may be related to the function of FRAP1 in translational control; similarly, nuclear import of FRAP1 is necessary for signaling through S6K, and an interaction with TPR supports the idea that mTOR associates with the nuclear pore. Interestingly, mRNA levels for 24 of the 63 partners (38%) of FRAP1 are significantly different between young and old patient samples. Proteins that can interact with FRAP1 are thus frequently expressed differentially with age in humans. FRAP1 interacting proteins that show significant changes in gene expression during aging in human muscle are shown in Table 2.
Table 2. Human FRAP1 Interacting Proteins with Significant Expression Changes in Muscle. doi:10.1371/journal.pgen.1000414.t002
To determine whether there is a relationship between protein interaction and a correlation in gene expression between protein pairs in this network, we compared the distribution of both negative and positive gene expression correlations with binary interactions. Figure 5 shows the distribution of gene expression correlations for the experimentally derived longevity network as compared to simulated networks of genes with randomly assigned binary connections. Both positively and negatively co-regulated protein pairs are enriched in the longevity interaction network relative to that observed in randomized networks. This observation supports the idea that interacting proteins are transcriptionally co-regulated. A list of the binary pairs with significant age-dependent transcriptional co-regulation is shown in Table S4.
Figure 5. Correlation of gene expression changes with binary protein interactions.
Distribution of transcriptional expression correlations for binary protein interaction pairs in the longevity network is shown in black. Distributions of correlations for randomized binary pairs are shown in red. The experimental network shows enrichment for both positively and negatively correlated binary pairs. Approximate inference via a two-sample Kolmogorov-Smirnov test confirms significant differences in the two distributions of correlations (p<0.00001). doi:10.1371/journal.pgen.1000414.g005
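The two ingredients of this comparison, a per-pair expression correlation and a two-sample Kolmogorov-Smirnov statistic, can be sketched as below. Pearson correlation is assumed here because the text does not name the correlation measure, and both function names are hypothetical. Only the KS statistic is computed; converting it to a p-value would need an additional approximation that is not shown.

```python
from bisect import bisect_right
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for v in a + b:  # the maximum gap is attained at an observed value
        cdf_a = bisect_right(a, v) / len(a)
        cdf_b = bisect_right(b, v) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

# Toy usage: correlations for interacting pairs vs. randomized pairs.
observed = [pearson([1, 2, 3, 4], [2, 4, 5, 9]), pearson([1, 2, 3, 4], [9, 6, 4, 1])]
randomized = [pearson([1, 2, 3, 4], [3, 1, 4, 2]), pearson([1, 2, 3, 4], [2, 3, 1, 4])]
print(ks_statistic(observed, randomized))
```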
Validating Human FRAP1 Interacting Proteins in a C. elegans Life Span Assay
In order to test the hypothesis that interacting partners of human longevity homologs might themselves be longevity proteins, we tested a group of these for effects on life span in C. elegans. The 24 FRAP1 interacting proteins with significant gene expression changes in aging human muscle are listed in Table 2. Of these, 18 were tested for their ability to modulate life span in C. elegans using RNAi-mediated knock-down (six of 24 were not tested because reagents were not available in our RNAi library). Wild-type N2 C. elegans were fed E. coli expressing double-stranded RNA corresponding to genes encoding 18 FRAP1 interacting proteins, and life spans were determined in two independent experiments. Of the 18 genes tested in this way, six reproducibly extended the life span of C. elegans by >10% (Figure 6). These genes are listed in Table 3. The gene showing the greatest effect on life span after RNAi treatment is RPS27. Knock-down of rps-27 expression in nematode resulted in 50% and 44% increases in life span in two independent experiments. Mammalian RPS27 encodes a zinc finger-containing protein component of the 40S ribosomal subunit. Several studies have established that TOR signaling can modulate life span in yeast and fly. It has been demonstrated further that inhibition of translation can also extend life span, indicating that loss of function in TOR signaling modulates aging through an effect on rates of translation. Since RPS27 is a component of the ribosome and interacts with FRAP1 (Tor), it is likely that the life span extension seen in the rps-27 knock-down is due to an effect on rates of translation either through TOR signaling, direct effects on ribosome structure, or a combination of the two.
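For readers unfamiliar with how survival curves and percent life span changes are obtained from raw death times, a minimal sketch follows. The day values are invented toy data, the function names are hypothetical, and the use of the mean life span for the percent change is an assumption, since the text does not state which summary statistic was used.

```python
from collections import Counter

def kaplan_meier(death_days, censored_days=()):
    """Kaplan-Meier estimate: [(day, survival probability)] at each death day."""
    deaths = Counter(death_days)
    censored = Counter(censored_days)
    at_risk = len(death_days) + len(censored_days)
    survival = 1.0
    curve = []
    for day in sorted(set(deaths) | set(censored)):
        if deaths[day]:
            survival *= 1 - deaths[day] / at_risk
            curve.append((day, survival))
        # animals that died or were censored leave the risk set
        at_risk -= deaths[day] + censored[day]
    return curve

def percent_extension(control_days, treated_days):
    """Percent change in mean life span relative to control (assumed metric)."""
    control_mean = sum(control_days) / len(control_days)
    treated_mean = sum(treated_days) / len(treated_days)
    return 100 * (treated_mean - control_mean) / control_mean

# Toy data: days of death for control and RNAi-treated animals.
control = [14, 16, 16, 18, 20]
treated = [18, 22, 24, 26, 30]
print(kaplan_meier(control))
print(kaplan_meier(treated, censored_days=[20]))
print(percent_extension(control, treated))
```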
Figure 6. Kaplan–Meier survival curves for C. elegans treated with RNAi knock-down of genes encoding homologs of six human FRAP1 interacting proteins.
Human homologs corresponding to nematode genes are as follows: MAPKAP2 (C44C8.6); SART3 (B0035.12); ARS2 (E01A2.2); RPS27 (F56E10.4); HYPK (F13G3.10); DKFZP564F0522 (C33H5.10). doi:10.1371/journal.pgen.1000414.g006
Table 3. Effect of Knock-down of FRAP1 Interacting Proteins on C. elegans Life Span. doi:10.1371/journal.pgen.1000414.t003
The fact that 33% of the candidates tested had a significant effect on life span extension is noteworthy. Previous genome-wide screens in C. elegans using RNAi have reported that less than 1% of the nematode genome may encode genes that can extend life span when knocked down.
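To give a rough sense of how unusual six hits in 18 would be at the background rate, the binomial tail can be computed directly. This is a back-of-the-envelope sketch and not an analysis from the study: it assumes the 18 genes behave as independent draws and uses 1% as an upper bound on the background hit rate, taken from the "less than 1%" figure quoted above.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Six or more hits among 18 genes at an assumed 1% background rate.
p_value = binomial_tail(18, 6, 0.01)
print(p_value)  # on the order of 1e-8
```

Because the 18 candidates were preselected rather than drawn at random from the genome, this number is best read as an illustration of the size of the enrichment, not as a formal test.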
We present here a large protein interaction network composed of human homologs of genes known to influence longevity in invertebrate systems and their interacting proteins. To compile this list of homologs, we selected genes that confer increased life span when mutated, deleted or knocked down in yeast, flies or nematodes. The longevity homolog sub-network (3,271 interactions) is derived from a much larger Core Network (70,358 interactions) that was generated in an unbiased fashion using a random high-throughput yeast two-hybrid process. The Core Network was generated from a larger network after removal of sticky proteins with very high node degrees. Analysis of the human longevity interactome presented here shows that the 175 human longevity homologs are more closely connected than would be expected by chance, with a mean path length of 4.15 as compared to an average of 4.61 in the Core Network. Another striking feature of human homologs of invertebrate longevity proteins is their exceptionally high average node degree of 18.8 (as compared to an average of 13.5 in the Core Network). This observation indicates that human longevity protein homologs may function as hub proteins in the human interactome. The fact that longevity proteins are hubs may be indicative of their having a central role in cellular function. They may also function as nodes that connect and/or integrate functionally diverse cellular components and systems. It is interesting to consider the possibility that knock-down of these longevity genes may extend life span through a mechanism that involves uncoupling connections between cellular components of diverse function.
A striking conclusion of this study is the dramatic degree of enrichment for genes encoding network proteins among genes that are transcriptionally modulated during aging in human muscle tissue. This correlation indicates that the network is enriched for proteins involved in human aging. This conclusion is consistent with the observation that homologs of human proteins interacting with the longevity homolog FRAP1 can increase life span when knocked down in C. elegans. Overall, these results provide evidence that the broad class of longevity proteins identified in invertebrates has a conserved role in processes of human aging and longevity.
Materials and Methods
Bait and Prey Construction
Complementary DNA was generated from poly(A)+ RNA isolated from multiple human tissues (including adult brain, fetal brain and liver) and inserted between the Gal4 transcriptional activation domain and the Schizosaccharomyces pombe URA4 coding region of pOAD.102 (prey plasmid) or the Gal4 DNA-binding domain and the S. cerevisiae MET2 coding region of pOBD.111 (bait plasmid). Yeast transformed with bait or prey plasmids were plated on medium lacking uracil (prey) or methionine (bait) to select for transformants expressing the markers fused to the cDNA inserts. Additional information about the plasmids, yeast strains and library construction can be found in Supplementary Information.
Automated Yeast Two-Hybrid Screening Process
The two-hybrid expression plasmids, pOBD.111 and pOAD.102, used in this study have been described. pOBD.111 and pOAD.102 are modifications of pOBD and pOAD. The bait and prey yeast strains used were, respectively, R2HMet (MATα ura3-52 ade2-101 trp1-901 leu2-3, his3-200 met2Δ::hisG gal4Δ gal80Δ) and BK100 (MATa ura3-52 ade2-101 trp1-901 leu2-3,112 his3-200 gal4Δ gal80Δ GAL2-ADE2 LYS2::GAL1-HIS3 met2::GAL7-lacZ), a derivative of PJ69-4A. Bait and prey cDNA libraries were made using poly(A)+ RNA prepared from human tissues (see Table S5) by random primed cDNA synthesis followed by the PCR addition of yeast recombination tails. Both bait and prey cDNAs are cloned as a double fusion between the two-hybrid domain on the 5′ end of the insert and an ORF-selection marker on the 3′ end. Specifically, bait cDNA inserts were cloned between the GAL4 DNA binding domain and the TRP1 or MET2 coding regions, and prey inserts between the GAL4 transcriptional activation domain and URA3. These cDNAs were then cloned into linearized expression vectors by recombination in yeast. Yeast transformed with bait were plated on medium lacking tryptophan or methionine to select for in-frame TRP1 or MET2 fusions, respectively, and prey were selected without uracil for in-frame URA3 fusions.
Y2H screens were performed in 96-well plates by mating in each well 5×10^6 cells of a yeast clone expressing a single bait with 5×10^6 clonally diverse cells from a prey library. After mating overnight, the matings were plated using a Genesis Workstation 150 liquid handling robot (Tecan) onto medium that selected simultaneously for the mating event, the expression of the ORF-selection markers, and the activity of the metabolic reporter genes, ADE2 and HIS3. Yeast that grew on this selection medium (“positives”) were counted and transferred into liquid medium in a 96-well format using a MegaPix colony picking robot (Genetix). A maximum of 48 colonies per mating were picked. Searches that yielded more than 200 positives (~2% of all searches) were considered to result from bait plasmids that activated transcription in the absence of specific protein-protein interactions, and were not analyzed further. Cloned inserts were amplified from plasmids by PCR. Liquid cultures grown from positive yeast colonies were used as templates in PCR reactions that amplified either both bait and prey cDNA inserts, or prey inserts only in screens in which the baits had been sequenced before the matings. The PCR reactions were assembled in 384-well format using the Genesis Workstation 150 or a custom-built (Zymark) PCR workstation that included a SciClone ALH 500 liquid handling robot (Zymark). PCR amplification took place in Primus-HT thermocyclers (MWG Biotech). The amplicons served as templates in DNA sequencing reactions. Identities of insert fragments were established by querying against the NCBI RefSeq database. The Y2H protein-protein interaction database is the result of two distinct workflow modes, referred to as random and directed. In the random mode, individual bait clones are picked randomly from a library and mated with a library of prey cDNAs.
Directed searches, on the other hand, are matings of prey libraries with a single intentionally constructed bait cDNA clone whose identity is known a priori. In random searches, moreover, the identity of the bait is discovered – depending, again, on a particular workflow – either before or after the mating has been performed. The alternatives are to sequence both the bait and prey from Y2H positives (called positive-derived sequence) or to sequence the bait plasmid before mating (called pre-sequencing) requiring only the prey to be sequenced from positive diploids. All Y2H search data and DNA sequences used to determine interaction pairs reported in this study are included in Table S5.
A total of 363 genes that had been reported to increase life span when mutated in yeast, fly, nematode and mouse species were compiled from SAGE KE and the published literature. We then screened for their respective clusters in the Homologene and Inparanoid databases. The human genes among those clusters were deemed to be the orthologs of the respective invertebrate genes. Any additional human paralogs were also taken into consideration. Mapping the 363 genes to their human orthologs and paralogs resulted in a total of 252 human genes.
k-means clustering (k = 2) was applied sequentially to preys and baits in the core protein interaction database to define two populations of genes based on their number of partners. Interactions involving genes in the high-degree population (i.e. baits with >87 interactions and preys with >231 interactions) were deemed promiscuous by this analysis and removed from the final dataset. The remaining interactions were referred to as the “Core Network”. The unfiltered core interactome had a total of 120,779 interactions involving 11,327 genes curated as NCBI Gene entries. The Core Network after filtering comprised 71,814 interactions among 10,430 genes. The aging interactome reported here includes only interactions from the Core Network.
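The filtering step above can be illustrated with a minimal sketch. It assumes a plain one-dimensional two-cluster k-means on partner counts, applied first to preys and then to baits; the function names (two_means_cutoff, drop_promiscuous), the initialization at the minimum and maximum count, and the order of the two passes are illustrative assumptions, not the pipeline used in the study.

```python
from collections import Counter


def two_means_cutoff(counts, iterations=100):
    """1-D k-means with k = 2; returns the largest count in the low cluster."""
    values = sorted(counts)
    # Assumption: centers start at the smallest and largest partner counts.
    low_center, high_center = float(values[0]), float(values[-1])
    low = values
    for _ in range(iterations):
        low = [v for v in values if abs(v - low_center) <= abs(v - high_center)]
        high = [v for v in values if abs(v - low_center) > abs(v - high_center)]
        if not high:  # all counts fall in one cluster; nothing to separate
            break
        new_low, new_high = sum(low) / len(low), sum(high) / len(high)
        if (new_low, new_high) == (low_center, high_center):
            break  # converged
        low_center, high_center = new_low, new_high
    return max(low)


def drop_promiscuous(pairs):
    """Remove (bait, prey) pairs whose prey, then bait, is in the high cluster."""
    prey_counts = Counter(prey for _, prey in pairs)
    prey_cut = two_means_cutoff(prey_counts.values())
    kept = [(b, p) for b, p in pairs if prey_counts[p] <= prey_cut]
    bait_counts = Counter(b for b, _ in kept)
    bait_cut = two_means_cutoff(bait_counts.values())
    return [(b, p) for b, p in kept if bait_counts[b] <= bait_cut]
```

On real data the cutoffs returned by such a procedure would play the role of the thresholds quoted above (87 partners for baits, 231 for preys), although the exact values depend on initialization and on the data.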
Network Topology Analysis
To establish the basis for suitable null hypotheses, the process of deriving subnetworks from the large interaction network was performed 1000 times with sets of 175 genes randomly selected from one of two sources: 1) any gene contained in the Y2H PPI database or 2) genes in either Homologene or InParanoid having homologs of C. elegans, D. melanogaster or S. cerevisiae. Because the latter set corresponds to genes conserved from phylogenetically distant organisms it is referred to as “ancient.” In each iteration of the process, the 175 genes were used to query the Y2H PPI database and create subnetworks in a manner otherwise identical to that of the procedure for longevity homologs.
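The resampling idea can be sketched as follows, using mean node degree as the statistic. The adjacency dictionary, the gene pool, and the function names are hypothetical; the study drew 175 genes per iteration from either the full Y2H database or the "ancient" gene set, and this sketch only shows the general shape of such a comparison.

```python
import random


def mean_degree(genes, adjacency):
    """Average number of interaction partners over a set of genes."""
    return sum(len(adjacency.get(g, ())) for g in genes) / len(genes)


def degree_null_test(query_genes, adjacency, pool, n_iter=1000, seed=0):
    """Compare the observed mean degree with that of random gene sets.

    pool is the set of genes to sample from (for example all genes in the
    interaction database, or only evolutionarily conserved ones).
    Returns (observed mean degree, empirical one-sided p-value).
    """
    rng = random.Random(seed)
    pool = sorted(pool)  # random.sample needs a sequence
    observed = mean_degree(query_genes, adjacency)
    size = len(query_genes)
    null = [mean_degree(rng.sample(pool, size), adjacency) for _ in range(n_iter)]
    as_extreme = sum(1 for value in null if value >= observed)
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (as_extreme + 1) / (n_iter + 1)
```

Swapping the pool argument between the two gene sources gives the two null distributions described above.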
The mean shortest path length between any two aging genes in the actual longevity network was calculated. We simulated the Core Network 100 times by rewiring the edges while preserving the node degree of each protein. The aging-related human genes were then screened through the 100 randomized networks to generate 100 simulated longevity networks. We then calculated the mean shortest path length between any two aging genes in each of the 100 randomized networks. A one-sided t-test was used to compare the mean shortest path length of the experimentally derived data to those of the 100 randomizations.
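A minimal sketch of this kind of randomization is given below. It assumes that rewiring is done by double edge swaps (a common way to preserve every node's degree) and that path lengths are averaged over reachable pairs only; both choices, and the function names, are assumptions made for illustration. The t-test itself is not shown.

```python
import random
from collections import deque


def rewire(edges, n_swaps, seed=0):
    """Degree-preserving randomization: swap a-b, c-d into a-d, c-b."""
    rng = random.Random(seed)
    edges = [tuple(edge) for edge in edges]
    present = {frozenset(edge) for edge in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def mean_shortest_path(edges, targets):
    """Mean breadth-first-search distance over reachable pairs of targets."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    targets = [t for t in targets if t in adjacency]
    lengths = []
    for index, source in enumerate(targets):
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        lengths.extend(distance[t] for t in targets[index + 1:] if t in distance)
    return sum(lengths) / len(lengths) if lengths else float("nan")
```

Calling rewire with 100 different seeds and passing each result to mean_shortest_path would give a null distribution against which the observed value can be compared.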
Gene Expression Data Analysis
No background correction was performed, given the very low levels of background intensity; however, we performed loess normalization on the entire set of probes to account for differences in the distribution of intensities among arrays. To select the genes that were differentially expressed with regard to age among the probes that matched our set of longevity proteins, we performed simple two-sample t-tests gene by gene and used the Benjamini-Hochberg procedure to derive adjusted q-values for the list of genes ranked by statistical significance. After deriving the number of significantly differentially expressed genes (based on an FDR cut-off of 5%), we wished to determine whether this set of probes was significantly enriched for genes whose expression changes with age. This motivated a permutation test of whether the identification of a gene as related to life span extension was independent of its differential expression in the microarray data on muscle tissue from old and young subjects. We performed a large number of permutations of the longevity protein label over the total set of probes. Each permutation randomly designated genes as either longevity protein genes or not, and within this random set we performed the same procedure to find the number of significantly differentially expressed genes. After 1000 permutations, we had 1000 randomly generated numbers of significantly differentially expressed genes, and we compared our observed number to this null distribution to obtain the p-value for the test of the null hypothesis that these genes (related to life extension) are unrelated to age in human muscle. We performed an identical analysis for the 1° interactor genes.
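The multiple-testing and permutation steps can be sketched as below. The sketch assumes that per-probe t-test p-values have already been computed and stored in a dictionary; the names bh_adjust, count_significant and label_permutation_test are hypothetical, and the add-one correction in the empirical p-value is a common convention rather than something stated above.

```python
import random


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def count_significant(p_values, fdr=0.05):
    """Number of tests with q-value at or below the FDR cut-off."""
    return sum(q <= fdr for q in bh_adjust(p_values))


def label_permutation_test(p_by_probe, labelled, n_perm=1000, fdr=0.05, seed=0):
    """Permute the gene-set label and recount significant probes each time.

    Returns (observed count, empirical p-value for enrichment).
    """
    rng = random.Random(seed)
    probes = sorted(p_by_probe)
    observed = count_significant([p_by_probe[g] for g in labelled], fdr)
    hits = 0
    for _ in range(n_perm):
        random_set = rng.sample(probes, len(labelled))
        if count_significant([p_by_probe[g] for g in random_set], fdr) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

The same function could be run a second time with the interactor genes as the labelled set.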
To examine whether probes for genes encoding binary interaction pairs showed more evidence of co-regulation in the microarray data, we examined the correlation of log2 expression between probes for pairs of genes that were 1) connected directly in the network and 2) unconnected in the network, the latter being an equal number of randomly chosen probe pairs drawn from the total list of probes on the Illumina array. For connected genes represented by more than one probe, the correlations of all relevant pairs of probes were estimated (i.e., if there were 3 probes for one gene matched with 2 probes for another, this generated a total of 6 correlations). The purpose of this was to determine whether genes connected in the interactome were more closely related in expression than randomly drawn pairs of genes.
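A minimal sketch of the pairwise comparison follows. It assumes Pearson correlation (the text does not name the coefficient), samples random unconnected gene pairs rather than probe pairs for simplicity, and uses hypothetical data structures: probes_by_gene maps a gene to its probe identifiers and log2_expr maps a probe to its expression values across arrays.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pair_correlations(gene_pairs, probes_by_gene, log2_expr):
    """One correlation per probe pair, so 3 x 2 probes give 6 values."""
    values = []
    for gene_a, gene_b in gene_pairs:
        for probe_a, probe_b in itertools.product(
            probes_by_gene[gene_a], probes_by_gene[gene_b]
        ):
            values.append(pearson(log2_expr[probe_a], log2_expr[probe_b]))
    return values


def random_unconnected_pairs(genes, connected, n, seed=0):
    """Draw n distinct gene pairs that are not connected in the network."""
    rng = random.Random(seed)
    genes = sorted(genes)
    connected = {frozenset(pair) for pair in connected}
    # Assumes every connected pair lies within `genes`.
    if n > len(genes) * (len(genes) - 1) // 2 - len(connected):
        raise ValueError("not enough unconnected pairs to sample")
    chosen = set()
    while len(chosen) < n:
        pair = frozenset(rng.sample(genes, 2))
        if pair not in connected:
            chosen.add(pair)
    return [tuple(sorted(pair)) for pair in chosen]
```

The two lists of correlations (connected and random) can then be compared with any two-sample summary or test.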
C. elegans Life Span Assays
Animals were grown on NGM agar plates seeded with OP50 E. coli at 20°C. RNAi bacterial strains were cultured as previously described. Wild-type N2 animals at the late L4 larval stage were fed with E. coli expressing different double-stranded RNAs and incubated at 25°C for life span experiments. 5-fluorodeoxyuridine (0.05 mg/ml) was added onto plates during the reproductive phase to eliminate progeny. Animals were transferred onto fresh plates every 3–6 days. The first day of adulthood is Day 1 in survival curves. Animals were scored as alive, dead or lost every other day. Animals that did not move in response to touching were scored as dead. Animals that died from causes other than aging, such as sticking to the plate walls, internal hatching or bursting in the vulval region, were scored as lost. In all life span assays, E. coli carrying the empty RNAi vector L4440 was fed to animals as controls. Statistical analyses were performed using the Prism 4 software (Graphpad Software, Inc., San Diego, CA, USA). Kaplan–Meier survival curves were plotted for each life span experiment and p values were calculated using the log-rank test.
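For readers who want to reproduce this kind of analysis without Prism, the following is a minimal sketch of a Kaplan–Meier estimate and a two-group log-rank test. It treats animals scored as lost as right-censored, which is an assumption about how such animals are handled, and it is not a reimplementation of the software used in the study; function names are illustrative.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier curve as (day, surviving fraction) at each death time.

    times: day each animal died or was lost; events: True = dead,
    False = lost (treated here as right-censored).
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for day in sorted(set(times)):
        deaths = sum(1 for t, e in zip(times, events) if t == day and e)
        leaving = sum(1 for t in times if t == day)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((day, survival))
        at_risk -= leaving
    return curve


def log_rank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p-value) with 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    death_days = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for day in death_days:
        n_a = sum(1 for t in times_a if t >= day)
        n_b = sum(1 for t in times_b if t >= day)
        d_a = sum(1 for t, e in zip(times_a, events_a) if t == day and e)
        d_b = sum(1 for t, e in zip(times_b, events_b) if t == day and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

In a life span experiment, group A would be the animals fed the empty-vector control and group B those fed a given RNAi clone.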
Invertebrate longevity genes and human homologs.
(0.11 MB XLS)
Longevity Interaction Network.
(0.20 MB XLS)
Significant expression changes in genes encoding Longevity Network proteins.
(0.19 MB XLS)
Correlation in expression changes for genes encoding binary interactors.
(0.41 MB XLS)
Yeast 2-hybrid screening data.
(11.69 MB XLS)
Coded Core Network.
(1.00 MB CSV)
We thank Andy Strand for comments on the manuscript. SS is the Chief Scientific Officer of Prolexys Pharmaceuticals. REH would like to acknowledge support from the Buck Trust.
Conceived and designed the experiments: RB AH RC DC SM REH. Performed the experiments: DC MT. Analyzed the data: RB AH RC DC JPM SM REH. Contributed reagents/materials/analysis tools: DC PK MT SS SM. Wrote the paper: JPM REH.