With data.table:
- e.g. find the smallest start and largest end for each gene ID
library(data.table)
data.table(tmpDF)[, .(start = min(start), end = max(end)), by = gene_id]
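A minimal sketch of the data.table call on toy data. The values in `tmpDF` and the result name `geneRange` are made up for illustration; only the column names come from the notes. Naming the expressions inside `.()` keeps the output columns from defaulting to `V1` and `V2`.

```r
library(data.table)

# Hypothetical toy data; only the column names come from the notes.
tmpDF <- data.frame(
  gene_id = c("g1", "g1", "g2", "g2", "g2"),
  start   = c(100, 50, 300, 250, 400),
  end     = c(200, 150, 350, 500, 450)
)

# Group by gene_id and name the summary columns explicitly.
geneRange <- as.data.table(tmpDF)[
  , .(start = min(start), end = max(end)),
  by = gene_id
]
print(geneRange)
```

The result has one row per `gene_id`, in order of first appearance.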
With by:
- e.g. keep the row with the smallest end for each start
do.call(rbind, by(inData, inData$start, function(x) x[which.min(x$end),]))
- e.g. find the smallest start and largest end for each gene ID
do.call(rbind, by(tmpDF, tmpDF$gene_id, function(x) c(start = min(x$start), end = max(x$end))))
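The two `by` patterns above return different kinds of object, which is easy to miss. Below is a small sketch on made-up data (the values in `inData` and the names `shortest` and `spans` are hypothetical) showing that a function returning a row gives back a data frame, while a function returning a vector gives back a matrix.

```r
# Hypothetical toy data; only the column names come from the notes.
inData <- data.frame(
  start = c(10, 10, 20, 20),
  end   = c(50, 30, 90, 60),
  name  = c("a", "b", "c", "d")
)

# Row-returning function: rbind stacks one-row data frames,
# so the result is a data frame that keeps every original column.
shortest <- do.call(rbind, by(inData, inData$start, function(x) {
  x[which.min(x$end), ]
}))

# Vector-returning function: rbind stacks named numeric vectors,
# so the result is a matrix with the group labels as row names.
spans <- do.call(rbind, by(inData, inData$start, function(x) {
  c(min_end = min(x$end), max_end = max(x$end))
}))

print(shortest)
print(spans)
```

If a data frame is needed from the second form, wrapping the result in `as.data.frame()` is one option; the group labels then stay in the row names rather than in a column.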