Have we got a philosophical transaction for you!
royalsocietypublishing.org
We implemented an embedding algorithm (similar to t-SNE, PCA, UMAP etc.) to work with genetic data. You can use it to rapidly visualise large collections and find clusters within them.
1/7
We extended stochastic cluster embedding. This is neat because
a) the algorithm was designed using a human perception study to tune parameters to make the output look 'clustery'
b) it's solvable by SGD, so we can parallelise it and make it fast on both CPU and GPU
2/7
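To illustrate point b), here is a toy sketch in Python. It is not mandrake's code and it does not use the SCE objective: the attraction and repulsion terms, the function name `toy_sgd_embedding` and all parameter values are assumptions made up for illustration. It only shows the general shape of an SGD-style embedding, where each step samples one pair of points and nudges a single point, which is what makes this kind of optimisation easy to spread across CPU threads or GPU workers.

```python
import numpy as np

def toy_sgd_embedding(dist, n_epochs=200, lr=0.1, k=3, seed=0):
    """Toy pairwise-sampled SGD embedding (hypothetical; NOT the SCE objective)."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    # k nearest neighbours of each point; column 0 is dropped on the
    # assumption that each point's zero self-distance sorts first.
    neighbours = np.argsort(dist, axis=1)[:, 1:k + 1]
    emb = rng.normal(scale=1e-2, size=(n, 2))
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            # Attraction: gradient step on 0.5 * |y_i - y_j|^2
            # for one randomly sampled near neighbour j.
            j = rng.choice(neighbours[i])
            emb[i] -= lr * (emb[i] - emb[j])
            # Repulsion: gradient step on -0.5 * log(eps + |y_i - y_r|^2)
            # for one randomly sampled other point r.
            r = rng.integers(n)
            if r != i:
                diff = emb[i] - emb[r]
                emb[i] += lr * diff / (1e-3 + diff @ diff)
    return emb

# Made-up demo data: two well-separated groups of 20 points in 5 dimensions.
rng = np.random.default_rng(1)
points = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(10, 1, (20, 5))])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
emb = toy_sgd_embedding(dist)
print(emb.shape)  # (40, 2)
```

Each update only reads two rows of the embedding and writes one, so many updates can in principle run at the same time; the real implementation is in the linked repository.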
We can process sequence alignments, assemblies (via sketching) and gene presence/absence
This let us run (in under a couple of hours) on
- 661k bacterial assemblies
- 1M SARS-CoV-2 genomes
You can make videos of the optimisation too
youtube.com
3/7
Our algorithm is called mandrake. You can find the code here: github.com
It's also available on conda, and as a web app you can run in your browser without any installation (using WebAssembly)
gtonkinhill.github.io
6/7
And of course, I made a stupid new logo and help page
mandrake.readthedocs.io
Thanks to @gerrythill for writing/analysing with me, Jukka Corander and Zhirong Yang for developing SCE
7/7