SCA6.0 - The S1A Serine Proteases --------------------------------- **Summary:** This script repeats the analysis of the S1A serine protease family, an example of a single protein containing at least three independent sectors. This analysis starts from the alignment used in: - Halabi, Rivoire, Leibler and Ranganathan. (2009) Cell 138:774-786. The goal is to examine the output of the pySCA (v6) analysis for the comparison to earlier results. In the S1A serine protease family, prior results indicate three sectors: "red" - associated with catalytic specificity, "green" - associated with catalytic function (the catalytic triad), and "blue" - associated with stability. To facilitate comparison, here are the S1A sector definitions as described in Halabi et al. in a format suitable for pasting into pyMol: \*\* Red:\*\* 17+161+172+176+177+180+183+187+188+189+191+192+213+215+216+220+221+226+227+228 \*\* Green:\*\* 19+33+42+43+55+56+57+58+102+141+142+184+194+195+196+197+198+199+213+214+216+225 \*\* Blue:\*\* 21+26+46+52+68+69+71+77+80+81+104+105+108+118+123+124+136+153+157+201+210+229+237+242+245 Prior to running this tutorial, the core calculation scripts must be executed from the command line as follows: >> ./scaProcessMSA.py Inputs/s1Ahalabi\_1470\_nosnakes.an -s 3TGI -c E -t -n >> ./scaCore.py Outputs/s1Ahalabi\_1470\_nosnakes.db >> ./scaSectorID.py Outputs/s1Ahalabi\_1470\_nosnakes.db Note that we supply pre-annotated alignments for all tutorial scripts *(the annotate\_pfMSA step is slow, and should only be run once)*. **O.Rivoire, K.Reynolds and R.Ranganathan** 10/2015 .. code:: python %matplotlib inline from __future__ import division import os import time import matplotlib.pyplot as plt import numpy as np import copy import scipy.cluster.hierarchy as sch from scipy.stats import scoreatpercentile import matplotlib.image as mpimg from IPython.display import display from IPython.display import Image from Bio.Seq import Seq from Bio import motifs import colorsys import scaTools as sca import mpld3 import cPickle as pickle from optparse import OptionParser if not os.path.exists('Outputs/'): os.makedirs('Outputs/') To begin, we read in the results of the above three scripts (scaProcessMSA, scaCore and scaSectorID), stored as three dictionaries in the database s1Ahalabi\_1470\_nosnakes.db .. code:: python db = pickle.load(open('Outputs/s1Ahalabi_1470_nosnakes.db','rb')) Dseq = db['sequence'] #the results of scaProcessMSA Dsca = db['sca'] #the results of scaCore Dsect = db['sector'] #the results of scaSectorID I. Alignment processing and composition ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ First, we print out a few statistics describing the alignment: .. code:: python print("After processing, the alignment size is %i sequences and %i positions" % \ (Dseq['Nseq'], Dseq['Npos'])) print("With sequence weights, there are %i effective sequences" % (Dseq['effseqs'])) .. parsed-literal:: After processing, the alignment size is 1344 sequences and 205 positions With sequence weights, there are 928 effective sequences To examine alignment composition, we plot a histogram of all pairwise sequence identities *(left panel)* and a global view of the sequence similarity matrix (defined by :math:`S\equiv \frac{1}{L}XX^\top`) *(right panel)*. The data show that the alignment is described by a nearly homogeneous distribution of sequence identities with a mean value of about 30%. .. code:: python # List all elements above the diagonal (i level] descr_dict = {k:descr_list.count(k) for k in descr_list \ if descr_list.count(k)>=atleast} print '\n Level %i:' % level print descr_dict # Most frequent catalytic specificities: print '\nCatalytic Specificities: ' for k in spec.keys(): if len(spec[k]) > 5: print k+': '+str(len(spec[k])) .. parsed-literal:: Level 0: {'Oligochaeta': 11, 'Malacostraca': 12, 'Actinopterygii': 127, 'Actinobacteria (class)': 14, 'Mammalia': 447, 'Amphibia': 58, 'Insecta': 564, 'Arachnida': 25} Level 1: {'Astigmata': 23, 'Decapoda': 12, 'Pleuronectiformes': 14, 'Carnivora': 45, 'Cypriniformes': 60, 'Anura': 58, 'Hymenoptera': 15, 'Ruminantia': 61, 'Coleoptera': 81, 'Tetraodontiformes': 19, 'Rodentia': 174, 'Perciformes': 12, 'Lepidoptera': 59, 'Actinomycetales': 14, 'Haplotaxida': 11, 'Primates': 142, 'Diptera': 391, 'Laurasiatheria': 14} Level 2: {'Streptomycetaceae': 12, 'Muridae': 170, 'Bovidae': 61, 'Lumbricidae': 11, 'Noctuidae': 28, 'Drosophilidae': 203, 'Tetraodontidae': 19, 'Tenebrionidae': 70, 'Pipidae': 58, 'Canidae': 44, 'Apidae': 10, 'Paralichthyidae': 12, 'Sarcoptidae': 14, 'Cyprinidae': 60, 'Suidae': 14, 'Culicidae': 153, 'Hominidae': 129, 'Cercopithecidae': 11} Catalytic Specificities: chymotrypsin: 85 kallikrein: 86 tryptase: 19 chymase: 8 allergen: 21 not trypsin: 34 thrombin: 7 plasminogen activator: 6 haptoglobin: 13 ELSE: 611 mast cell protease: 24 plasminogen: 6 marapsin: 6 granzyme: 52 elastase: 42 trypsin: 182 Based on this, we select both taxonomic groups and specificity classes, and associate them to colors for representation. We choose broad taxonomic groups (corresponding to Level 1), and well-populated specificity classes that are well-represented in the alignment . To see a complete legend that maps numeric codes to color, use: :: >>> sca.figColors() We start with taxonomic groups: .. code:: python phylo = list(); fam_names = ['Oligochaeta', 'Malacostraca','Actinopterygii','Actinobacteria',\ 'Mammalia','Amphibia','Insecta','Arachnida'] col = (0,0.12,0.18,0.38,0.5,0.60,0.78,0.90) #Oligochaeta = red, Malacostraca = orange, Actinopterygii = yellow, #Actinobacteria = green, Mammalia = cyan, Amphibia = dark blue, #Insecta = purple, Arachnida = bright pink for i,k in enumerate(fam_names): sf = sca.Unit() sf.name = fam_names[i].lower() sf.col = col[i] sf.items = [j for j,q in enumerate(Dseq['hd']) if sf.name in q.lower()] phylo.append(sf) Now we assign substrate specificity classes, and also sort sequences into catalytically active and inactive enzymes (the haptoglobins). .. code:: python spec_names = ['chymotrypsin','trypsin','tryptase', 'kallikrein', 'granzyme'] cat_act = ['active','haptoglobin'] col_spec = [0,0.12,0.38,0.5,0.60,0.9] #chymotrypsin = red, trypsin = orange, tryptase = green, #kallikrein = cyan, granzyme = bright pink col_vert = [0.12, 0.5] #vertebrates = orange, invertebrates = cyan col_act = [0.65,0.5] #active = blue, inactive (haptoglobin) = cyan spec_classes = []; cat_classes = []; vert_classes = []; for i,k in enumerate(spec_names): sp = sca.Unit() sp.col = col_spec[i] sp.name = k sp.items = spec[k] spec_classes.append(sp) for i,k in enumerate(cat_act): sp = sca.Unit() sp.col = col_act[i] sp.name = k if k == 'haptoglobin': sp.items = spec[k] else: sp.items = [k for k in range(Dseq['Nseq']) if (k not in spec['haptoglobin'] \ and k not in spec['ELSE'])] cat_classes.append(sp) for i,k in enumerate(vert.keys()): sp = sca.Unit() sp.col = col_vert[i] sp.name = k sp.items = vert[k] vert_classes.append(sp) To examine the relationship between **global** sequence similarity, phylogeny, and functional sub-class, we plot the top six independent components of the sequence correlation matrix (including sequence weights). In these plots, each point represents a particular sequence, and the distance between points reflects global sequence identity. The color codings are as follows: **top row:** phylogenetic annotation **second row:** active *(blue)* vs. inactive/haptoglobin *(cyan)* **third row:** specificity, chymotrypsin *(red)*, trypsin *(orange)*, tryptase *(green)*, kallikrein *(cyan)*, granzyme *(bright pink)* **fourth row:** vertebrate *(orange)* vs. invertebrate *(cyan)* The data show some seperation of particular phylogenetic groups along ICs 1-5. For example, a subset of mammalian sequences *(cyan)* seperate out along IC2. In contrast, S1A specificity and catalytic activity are poorly seperated/classified using the independent components of the global sequence similarity matrix. This provides one indication that S1A enzymatic specificity and catalysis are not well-described by global sequence identity, but are largely encoded in a subset of positions. We will later see that these sequences *can* be seperated by projecting the sequence space using the positional correlations. .. code:: python plt.rcParams['figure.figsize'] = 9, 13 U = Dsca['Uica'][1] pairs = [[2*i,2*i+1] for i in range(3)] for k,[k1,k2] in enumerate(pairs): plt.subplot(4,3,k+1) sca.figUnits(U[:,k1], U[:,k2], phylo) plt.xlabel(r"$IC_{seq}^{%i}$"%(k1+1), fontsize=16) plt.ylabel(r"$IC_{seq}^{%i}$"%(k2+1), fontsize=16) plt.subplot(4,3,k+4) sca.figUnits(U[:,k1], U[:,k2], cat_classes) plt.xlabel(r"$IC_{seq}^{%i}$"%(k1+1), fontsize=16) plt.ylabel(r"$IC_{seq}^{%i}$"%(k2+1), fontsize=16) plt.subplot(4,3,k+7) sca.figUnits(U[:,k1], U[:,k2], spec_classes) plt.xlabel(r"$IC_{seq}^{%i}$"%(k1+1), fontsize=16) plt.ylabel(r"$IC_{seq}^{%i}$"%(k2+1), fontsize=16) plt.subplot(4,3,k+10) sca.figUnits(U[:,k1], U[:,k2], vert_classes) plt.xlabel(r"$IC_{seq}^{%i}$"%(k1+1), fontsize=16) plt.ylabel(r"$IC_{seq}^{%i}$"%(k2+1), fontsize=16) plt.tight_layout() .. image:: SCA_S1A_files/SCA_S1A_17_0.png II. First-order statistics: position-specific conservation. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Plot the position-specific conservation values for each S1A protease position. :math:`D_i` is calculated according to equation S4 (supplemental information). .. code:: python fig, axs = plt.subplots(1,1, figsize=(9,4)) xvals = [i+1 for i in range(len(Dsca['Di']))] xticks = [0,50,100,150,200] plt.bar(xvals,Dsca['Di'], color='k') plt.tick_params(labelsize=11); plt.grid() axs.set_xticks(xticks); labels = [Dseq['ats'][k] for k in xticks] axs.set_xticklabels(labels); plt.xlabel('Amino acid position', fontsize=18); plt.ylabel('Di', fontsize=18); .. image:: SCA_S1A_files/SCA_S1A_20_0.png III. Second-order statistics: conserved correlations. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Plot the SCA correlation matrix ( :math:`\tilde{C_{ij}}` ) computed according to Equations 4 + 5 of Rivoire et al. .. code:: python plt.rcParams['figure.figsize'] = 9, 8 plt.imshow(Dsca['Csca'], vmin=0, vmax=1.4,interpolation='none',\ aspect='equal') .. parsed-literal:: .. image:: SCA_S1A_files/SCA_S1A_23_1.png Plot the eigenspectrum of (1) the SCA positional coevolution matrix (:math:`\tilde{C_{ij}}`) *(black bars)* and (2) 10 trials of matrix randomization for comparison. This graph is used to choose the number of significant eigenmodes (:math:`k^* = 7`). .. code:: python plt.rcParams['figure.figsize'] = 9, 4 hist0, bins = np.histogram(Dsca['Lrand'].flatten(), bins=Dseq['Npos'], \ range=(0,Dsect['Lsca'].max())) hist1, bins = np.histogram(Dsect['Lsca'], bins=Dseq['Npos'], \ range=(0,Dsect['Lsca'].max())) plt.bar(bins[:-1], hist1, np.diff(bins),color='k') plt.plot(bins[:-1], hist0/Dsca['Ntrials'], 'r', linewidth=3) plt.tick_params(labelsize=11) plt.xlabel('Eigenvalues', fontsize=18); plt.ylabel('Numbers', fontsize=18); print 'Number of eigenmodes to keep is %i' %(Dsect['kpos']) #mpld3.display() .. parsed-literal:: Number of eigenmodes to keep is 7 .. image:: SCA_S1A_files/SCA_S1A_25_1.png Plot the top significant eigenmodes *(top row)* and associated independent components *(bottom row)*. The ICs are an optimally independent representation of the seven different residue groups. .. code:: python plt.rcParams['figure.figsize'] = 9.5, 5.5 pairs = [[0,1],[2,3],[4,5],[5,6]] EVs = Dsect['Vsca'] ICs = Dsect['Vpica'] for k,[k1,k2] in enumerate(pairs): plt.subplot(2,4,k+1) plt.plot(EVs[:,k1], EVs[:,k2], 'ok') plt.xlabel("EV%i"%(k1+1), fontsize=16) plt.ylabel("EV%i"%(k2+1), fontsize=16) plt.subplot(2,4,k+5) plt.plot(ICs[:,k1], ICs[:,k2], 'ok') plt.xlabel("IC%i"%(k1+1), fontsize=16) plt.ylabel("IC%i"%(k2+1), fontsize=16) plt.tight_layout() .. image:: SCA_S1A_files/SCA_S1A_27_0.png IV. Sector decomposition. ~~~~~~~~~~~~~~~~~~~~~~~~~~ To define the positions with significant contributions to each of the independent components (ICs), we make a empirical fit for each IC to the t-distribution and select positions with greater than a specified cutoff on the CDF. We choose :math:`p=0.95` as our cutoff. Note that since some positions might contribute significantly to more than one IC (an indication of non-independence of ICs), we apply a simple algorithm to assign such positions to one IC. Specifically, we assign positions to the IC with which it has the greatest degree of co-evolution.In the case of IC7, all of the positions are more strongly assigned to other ICs, so we end up with 6 ICs containing 97 total positions The data indicate generally good fits for the top seven ICs, and we return the positions contributing to each IC in a format suitable for cut and paste into PyMol. .. code:: python plt.rcParams['figure.figsize'] = 8, 8 Vpica = Dsect['Vpica'] for k in range(Dsect['kpos']): iqr = scoreatpercentile(Vpica[:,k],75) - scoreatpercentile(Vpica[:,k],25) binwidth=2*iqr*(len(Vpica)**(-0.33)) nbins=round((max(Vpica[:,k])-min(Vpica[:,k]))/binwidth) plt.subplot(Dsect['kpos'],1,k) h_params = plt.hist(Vpica[:,k], nbins) x_dist = np.linspace(min(h_params[1]), max(h_params[1]), num=100) plt.plot(x_dist,Dsect['scaled_pd'][k],'r',linewidth = 2) plt.plot([Dsect['cutoff'][k],Dsect['cutoff'][k]], [0,60], 'k--',linewidth = 1) plt.xlabel(r'$V^p_{%i}$'%(k+1), fontsize=14) plt.ylabel('Number', fontsize=14) for n,ipos in enumerate(Dsect['ics']): sort_ipos = sorted(ipos.items) ats_ipos = ([Dseq['ats'][s] for s in sort_ipos]) ic_pymol = ('+'.join(ats_ipos)) print('IC %i is composed of %i positions:' % (n+1,len(ats_ipos))) print(ic_pymol + "\n") .. parsed-literal:: IC 1 is composed of 32 positions: 16+19+28+42+43+44+54+55+56+57+58+102+140+141+142+155+168+182+191+193+194+195+196+197+198+199+211+214+220+225+237+238 IC 2 is composed of 22 positions: 29+138+160+161+172+176+177+180+183+184+188A+189+192+213+215+216+221+222+226+227+228+230 IC 3 is composed of 20 positions: 21+25+26+27+46+52+68+69+71+77+81+104+107+108+114+118+123+124+136+201 IC 4 is composed of 10 positions: 30+31+32+34+40+51+139+152+156+200 IC 5 is composed of 6 positions: 85+89+91+92+94+95 IC 6 is composed of 7 positions: 47+53+101+103+105+229+234 IC 7 is composed of 0 positions: .. parsed-literal:: /Users/kreynolds/anaconda/lib/python2.7/site-packages/matplotlib/axes/_subplots.py:69: MatplotlibDeprecationWarning: The use of 0 (which ends up being the _last_ sub-plot) is deprecated in 1.4 and will raise an error in 1.5 mplDeprecation) .. image:: SCA_S1A_files/SCA_S1A_30_2.png To define protein sectors, we examine the structure of the SCA positional correlation matrix with positions contributing to the top independent components (ICs) ordered by weight (*left panel*). This provides a basis to determine/interpret which ICs are truly statistically independent (defining an independent sector) and which represent hierarchical breakdowns of one sector. In this case, the data suggest that ICs 1-7 are indeed relatively independent. .. code:: python #plot the SCA positional correlation matrix, ordered by contribution to the top ICs plt.rcParams['figure.figsize'] = 10, 10 plt.subplot(121) plt.imshow(Dsca['Csca'][np.ix_(Dsect['sortedpos'], Dsect['sortedpos'])], \ vmin=0, vmax=2,interpolation='none',\ aspect='equal',extent=[0,sum(Dsect['icsize']),\ 0,sum(Dsect['icsize'])]) line_index=0 for i in range(Dsect['kpos']): plt.plot([line_index+Dsect['icsize'][i],line_index+Dsect['icsize'][i]],\ [0,sum(Dsect['icsize'])],'w', linewidth = 2) plt.plot([0,sum(Dsect['icsize'])],[sum(Dsect['icsize'])-\ line_index,sum(Dsect['icsize'])-line_index],'w', linewidth = 2) line_index += Dsect['icsize'][i] #define the new sector groupings - 3 total sec_groups = ([0],[1],[2],[3],[4],[5]) sectors = list() #defining color codes for the sectors #s1 = green, s2 = red, s3 = blue, s4 = orange, s5 = hot pink, s6 = cyan c = [0.4,0,0.7,0.15,0.9,0.5] for n,k in enumerate(sec_groups): s = sca.Unit() all_items = list() all_Vp = list() for i in k: all_items = all_items+Dsect['ics'][i].items all_Vp = all_Vp+list(Dsect['ics'][i].vect) svals = np.argsort(all_Vp) s.items = [all_items[i] for i in svals] s.col = c[n] sectors.append(s) #plot the re-ordered matrix plt.subplot(122) line_index=0 sortpos = list() for s in sectors: sortpos.extend(s.items) plt.imshow(Dsca['Csca'][np.ix_(sortpos, sortpos)], vmin=0, vmax=2.2,\ interpolation='none',aspect='equal',\ extent=[0,len(sortpos),0,len(sortpos)]) for s in sectors: plt.plot([line_index+len(s.items),line_index+len(s.items)],\ [0,len(sortpos)],'w', linewidth = 2) plt.plot([0,sum(Dsect['icsize'])],[len(sortpos)-line_index,\ len(sortpos)-line_index],'w', linewidth = 2) line_index += len(s.items) plt.tight_layout() .. image:: SCA_S1A_files/SCA_S1A_32_0.png Print the sector positions, in a format suitable for pyMol, and create a pyMol session (in the Outputs directory) with the sectors (and decomposition into independent components) as seperate objects. .. code:: python for i,k in enumerate(sectors): sort_ipos = sorted(k.items) ats_ipos = ([Dseq['ats'][s] for s in sort_ipos]) ic_pymol = ('+'.join(ats_ipos)) print('Sector %i is composed of %i positions:' % (i+1,len(ats_ipos))) print(ic_pymol + "\n") sca.writePymol('3TGI', sectors, Dsect['ics'], Dseq['ats'], \ 'Outputs/S1A.pml','E', '../Inputs/', 0) .. parsed-literal:: Sector 1 is composed of 32 positions: 16+19+28+42+43+44+54+55+56+57+58+102+140+141+142+155+168+182+191+193+194+195+196+197+198+199+211+214+220+225+237+238 Sector 2 is composed of 22 positions: 29+138+160+161+172+176+177+180+183+184+188A+189+192+213+215+216+221+222+226+227+228+230 Sector 3 is composed of 20 positions: 21+25+26+27+46+52+68+69+71+77+81+104+107+108+114+118+123+124+136+201 Sector 4 is composed of 10 positions: 30+31+32+34+40+51+139+152+156+200 Sector 5 is composed of 6 positions: 85+89+91+92+94+95 Sector 6 is composed of 7 positions: 47+53+101+103+105+229+234 These positions have clear physical consistency with the concept of sectors as functional, physically contiguous units in the protein structure. IC1 corresponds to the previously reported green sector, IC2 to the red sector, and IC3 is similar to the blue sector. ICs 4-6 follow the general principle of physical connectivity that seems to characterize sectors, but their functional meaning requires further study. V. Sequence subfamilies and the basis of sector hierarchy. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ How does the phylogenetic and functional heterogeneity in the MSA influence the sector definitions? To address this, we take advantage of mathematical methods for mapping between the space of positional and sequence correlations, as described in *Rivoire et al* (see equations 8-11). Using this mapping, we plot the top :math:`k^*` ICs of the matrix :math:`\tilde{C_{ij}}` as 2-D scatter plots *(top row)*, and compare them to the corresponding sequence space divergence *(middle and bottom rows)*. The amino acid positions contributing to each IC are colored by sector *(sector 1 = green, sector 2 = red, sector 3 = blue, sector 4 = orange, sector 5 = pink, sector 6 = cyan, all top row)*. The sequences are color-coded according to phylogenetic classifications *(row 2)*, specificity *(row 3)*, vertebrate/invertebrate *(row 4)*, or catalytic activity *(row 5)* as we defined above. .. code:: python plt.rcParams['figure.figsize'] = 18, 20 pairs= [[0,1],[1,2],[2,3],[4,5],[5,6]] for n,[k1,k2] in enumerate(pairs): plt.subplot(5,5,n+1) sca.figUnits(Dsect['Vpica'][:,k1], Dsect['Vpica'][:,k2], \ sectors, dotsize = 6) plt.xlabel('IC%i' % (k1+1), fontsize=16) plt.ylabel('IC%i' % (k2+1), fontsize=16) plt.subplot(5,5,n+6) sca.figUnits(Dsect['Upica'][:,k1], Dsect['Upica'][:,k2], \ phylo, dotsize = 6) plt.xlabel(r'$U^p_{%i}$' % (k1+1), fontsize=16) plt.ylabel(r'$U^p_{%i}$' % (k2+1), fontsize=16) plt.subplot(5,5,n+11) sca.figUnits(Dsect['Upica'][:,k1], Dsect['Upica'][:,k2], \ spec_classes, dotsize = 6) plt.xlabel(r'$U^p_{%i}$' % (k1+1), fontsize=16) plt.ylabel(r'$U^p_{%i}$' % (k2+1), fontsize=16) plt.subplot(5,5,n+16) sca.figUnits(Dsect['Upica'][:,k1], Dsect['Upica'][:,k2], \ vert_classes, dotsize = 6) plt.xlabel(r'$U^p_{%i}$' % (k1+1), fontsize=16) plt.ylabel(r'$U^p_{%i}$' % (k2+1), fontsize=16) plt.subplot(5,5,n+21) sca.figUnits(Dsect['Upica'][:,k1], Dsect['Upica'][:,k2], \ cat_classes, dotsize = 6) plt.xlabel(r'$U^p_{%i}$' % (k1+1), fontsize=16) plt.ylabel(r'$U^p_{%i}$' % (k2+1), fontsize=16) plt.tight_layout() .. image:: SCA_S1A_files/SCA_S1A_39_0.png As previously described, we see a clear correpsondence between the top three sectors (red, green, blue) and distinct functional properties of the S1A protease family. IC1 *(the green sector)* seperates the non-catalytic haptoglobins *(cyan, bottom row)* from the catalytic family members. IC2 *(the red sector)* seperates the proteases by specificity, and IC3 *(the blue sector)* seperates vertebrate from invertebrate sequences. The remaining ICs (4-6) correspond to small numbers of physically contiguous positions, with no obvious relationship to particular sequence functional groups. To more clearly see seperations in sequence classification, we also plot the above distributions of sequences as stacked bar plots. .. code:: python plt.rcParams['figure.figsize'] = 18, 12 col = list() for k in spec_classes: col = col + [colorsys.hsv_to_rgb(k.col,1,1)] for k in range(Dsect['kpos']): forhist = list() for group in spec_classes: forhist.append([Dsect['Upica'][i,k] for i in group.items]) plt.subplot(4,Dsect['kpos'],k+1) plt.hist(forhist, histtype='barstacked',color=col) col = list() for k in vert_classes: col = col + [colorsys.hsv_to_rgb(k.col,1,1)] for k in range(Dsect['kpos']): forhist = list() for group in vert_classes: forhist.append([Dsect['Upica'][i,k] for i in group.items]) plt.subplot(4,Dsect['kpos'],k+(Dsect['kpos'])+1) plt.hist(forhist, histtype='barstacked',color=col) col = list() for k in cat_classes: col = col + [colorsys.hsv_to_rgb(k.col,1,1)] for k in range(Dsect['kpos']): forhist = list() for group in cat_classes: forhist.append([Dsect['Upica'][i,k] for i in group.items]) plt.subplot(4,Dsect['kpos'],k+2*(Dsect['kpos'])+1) plt.hist(forhist, histtype='barstacked',color=col) col = list() for k in phylo: col = col + [colorsys.hsv_to_rgb(k.col,1,1)] for k in range(Dsect['kpos']): forhist = list() for group in phylo: forhist.append([Dsect['Upica'][i,k] for i in group.items]) plt.subplot(4,Dsect['kpos'],k+(3*Dsect['kpos'])+1) plt.hist(forhist, histtype='barstacked',color=col) .. image:: SCA_S1A_files/SCA_S1A_42_0.png