Chromosome or group specific LD heatmaps

If you work on genomics, chances are you have to create an LD heatmap figure sooner or later. I am sure you have used the R package LDheatmap or have thought of using it at some point. The only problem is that despite being a sweet (not necessarily fast) program, it does have its limitations. The biggest one being production of publication-worthy graphics.

To give you an example, look at the following LD heatmap it produced with my mock data. My mock data has 7 chromosomes with 10 markers on each chromosome.

Here is the input data in Excel format.

The R code to obtain the plot above is:

setwd('C:\\Users\\username\\directory')
library(genetics)
library(LDheatmap)

d <- read.table("d.txt", header=F, sep="\t")
dim(d)

# use the information obtained from the dim command in the following step to subset genotypic data:
dsnp <- d[,3:372]
d <- t(dsnp)
colnames(d) <- as.character(unlist(d[1,]))
d = d[-1, ]
d <- as.data.frame(d)
dnum <- ncol(d)
for(o in 1:dnum){d[,o]<-as.genotype(d[,o])}
class(d[,1]) #check that your genotypes are truly genotype objects</pre>
<pre># subset the marker distance/genomic map
ddist <- d[,2]
ddist <- as.numeric(ddist)

# run LDheatmap and draw the plot
dHeatmap <- LDheatmap(d, ddist, LDmeasure = "r", add.map = F, add.key=F)

This, as I said, makes the plot above. But, when you are running the code across all chromosomes in a genome, then you want to see a clearer picture of what is happening in pairwise chromosomes. For example, I work on wheat which has 3 sub-genomes (A, B, D) and 21 chromosomes (1-7A, 1-7B, 1-7D). Typically, I want to see relationship among chromosomes within a sub-genome drawn out in a bit more practical and self-obvious way. Clearly, the output of LDheatmap is not enough in that case. I could (& have in the past) export the LD matrix and drawn the figure in Excel using conditional formatting. However, this approach is not pragmatic as we are moving towards using tens of thousands of markers. In certain crops and animals, they already are using millions of markers. Excel cannot handle much data at all so come back to R we must.

Someone might say, why not use a subset of fewer markers that are perhaps 100 bp or 1Kbp apart. Yes, while that would be one good way to deal with the problem, it is not the most elegant solution for two main reasons: 1) we lose resolution with fewer markers, and 2) this works usually if the LD is high enough in your species but not all species may have pre-calculated accurate LD estimates.

Anyway, back to the problem. Instead of relying on Excel, I use the ggplot2 package in R to give me a good looking figure in a way I want. Here is what I mean: As you can see, one can easily discern the pairwise relationships among different chromosomes now. Annotation of the plot with markers is not shown here mainly because I find it unnecessary. If you have 1000s of markers, displaying their names in legible way may not be possible. Besides, any & all important markers can either be listed in the caption, and discussed in the manuscript.

The code that produces the heatmap above:

library(ggplot2)
library(reshape2)

# first, extract the triangle matrix with LD (r2) values
dmatrix <- dHeatmap$LDmatrix
#write.table(dmatrix, "d_ld_matrix.txt", sep="\t", quote=F)

# then convert the ld matrix to a data frame
mat <- as.data.frame(dmatrix)

# reshape the dataframe for  ggplot2
mat <- melt(dmatrix, varnames = c('snp1', 'snp2'), na.rm = TRUE)
colnames(mat)[3] <- c("r2")

# add chromosome labels
# if your chromosome names are longer (or smaller), change the '2' to higher or lower digits
# chromosome names from the first column (assume X-axis)
mat$chr <- substr(mat$snp1,1,2)
# chromosome names from the second column (assume Y-axis)
mat$chr1 <- substr(mat$snp2,1,2)

# produce the heatmap
ggplot(mat, aes(snp1, snp2)) +
  geom_tile(aes(fill = r2), color='white') +
  scale_fill_gradient(low = 'pink', high = 'darkblue', space = 'Lab', name='Color scale\n', limits=c(0,1), breaks=c(0.0,0.25,0.5,0.75,1.0)) +
  facet_grid(chr1 ~ chr, scales="free", space="free", as.table=F, switch="y") +
  theme_bw() +
  theme(axis.text=element_blank(),
        axis.title=element_blank(),
        axis.ticks=element_blank(),
        axis.line=element_blank(),
        panel.border=element_blank(),
        panel.grid.major=element_blank(),
        strip.text.y = element_text(angle = 180),
        strip.text = element_text(size=10),
        #strip.text.x = element_text(margin = margin(.1, 0, .1, 0, "cm")),
        strip.background = element_rect(fill = "white", colour = NA))

This being a demo, I didn’t put finishing touches in the final product. You can experiment with borders, colors, etc to get the figure you want.

And there you have it. Let me know if this works well for you. Thanks for visiting!

Advertisements

याँ मुन्तिर कमेन्ट ठोक्नुस्

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s