Circoletto @ the BAT cave


BLAST


- this will not work with uploaded BLAST output (see below)
- please be considerate with this, we have also set stricter limits and safeguards
- Circoletto never did that, by default



output
- we used to show only one, the default is now all

- colouring is relative, e.g. red may not represent universally best hits but the best of bad
- '(score-min)/(max-min)' should give more colour range esp. for % identity
- incompatible with the ratio above, and currently only allowed with % identity
- be careful, we don't check numbers and logic yet
- scored colours will still show in histograms
- scored colours will still show around ribbons and in histograms
- i.e. keep input order with queries first, database next

- we'll check they're DNA
- may assist clarity, incompatible with ribbon untangling
- ...then read anticlockwise, may assist clarity with e.g. segment order
- faint green and red bands at ideogram edges, green-to-red is correct orientation

- in pixels
- switch off ribbon untangling for rainbows? scored colours will still show around ribbons and in histograms



news and updates

07.09.16 | minor bug in histogram data calculation with absolute scoring, which is only for visualisation performance - you probably didn't even see it
27.08.16 | % identity colouring can now be absolute, i.e. you can set max % identity for blue, green, and orange, with the rest coloured red
                  (thanks to Dr. Laura Sycuro, University of Calgary, Canada, for the discussion)
27.08.16 | there's a new (but non-default, yet) mode to assign colours to scores, '(score-min)/(max-min)', that should especially help % identity scoring
05.07.16 | non-[\w-.] label characters now replaced with '.' instead of '_' to avoid SVG issues, and best hit outline back to black for SVG
04.07.16 | (DNA) sequences can now be reverse complemented, best hits now outlined with same colour instead of black, tested with latest Circos (0.69-3), and GitHub!
19.12.15 | nothing specific really, just uploaded the latest offline package because there have been some edits since August (I really need to GitHub this...)
29.08.15 | bug with >20char labels (which we trim to 20) causing duplicated/disordered ideograms - code "cleaning"...
                  - also switched to a more condensed and lighter-coloured font
10.08.15 | now with greyscale rainbows as well, which might help more - otherwise remember the 7 rainbow colours
10.08.15 | bug with printing histogram data of inverted ribbons - overzealous code cleaning...
08.08.15 | ribbon colouring enhanced, 7-colour rainbow also available (new, so you'll be my lab rats - tip: switch off ribbon untangling)
                  - also switched to old Circos' blue/green/orange/red, brighter and more recognisable
04.08.15 | max total sequences set to 200 from 500, as per default Circos limit,
                  which is reasonable anyway but also to save me/you from modifying and distributing housekeeping.conf
                  - if you need more please download and edit both Circoletto (max_sequences) and Circos' housekeeping.conf (max_ideograms)
                  ...and apologies to the user who experienced this "bug" this morning, unfortunately I tend to do this on the fly
01.08.15 | Perl/HTML code update/cleanup and a much better downloadable package below, and more server resources with higher limits
                  - some new features are in my mind but we'll see, any requests are welcome in any case
01.11.13 | with colouring by database annotations and in very specific / unlikely scenarios, ribbons wouldn't show - still...
21.10.13 | caught bug with orientation lights for very short sequences (thank you StV1/Contigs user)
19.10.13 | it's been a year, hope you are all well :-)
                  - for the increasing number of Circoletto users, we have a bug fix and plenty of new and hopefully useful features...
                  - for a few of them, and a fruitful discussion, I'd specifically like to thank Prof Dahlia Nielsen from North Carolina State University, US
                  - the bug involved the non-twisting of inverted (reverse complementary) nucleotide sequence ribbons with user-provided BLAST output
                      (i.e. the ribbons should have been twisted but they were not)
                  - the new features include control over which colours of ribbons to show, the depth-ordering of ribbons, the orientation of ideograms
                      (and whether you read them clockwise or anticlockwise), and the orientation lights for reading sequences in the right orientation i.e. green to red
                  - we also improved or introduced text here and there, including in the results page
                  - per usual, there could be new bugs and issues with Circoletto, please be patient and please send us feedback
15.10.12 | not all ribbons were drawn when >500 (?) of them - don't know when this one sneaked in but it's getting embarrassing, isn't it? :-)
13.10.12 | ribbons were twisting even without sequence inversion, which has now been corrected - most probably another bug since the Circos upgrade
16.09.12 | BLAST output handling corrected
06.09.12 | corrected month-old bug in loading annotation file with sequence labels containing '-' and '.'
22.08.12 | please take care to use accepted colour names in the optional annotation file (I'll implement an internal check soon)
17.08.12 | still ironing out issues introduced with switching to the latest Circos, sincere apologies for all the failed runs
06.08.12 | compatibility with Circos 0.62-1 (applicable to offline Circoletto, thanks to Till Bayer for the heads-up), and a minor tweak
06.06.12 | inspired by user attempts to e.g. map reads to reference, ideogram order can now be reversed (incompatible with ribbon untangling)
                  - I might try and rainbow-colour ideograms and/or ribbons based on order
06.06.12 | been a long time :-) input sequence order is now maintained if ribbon untangling does not run (either by user choice or Circoletto limitations)
20.12.11 | been having problems with server overload, so tBLASTx is now under tighter control
                  - in general, please be considerate (until we upgrade our server :-))
04.08.11 | ribbons can now be coloured by inversion, normal in black, inverted in lime (thanks to Yannick)
26.07.11 | slightly delayed release of a minor update, but which also now includes a corrected Circos file (thanks to Colin)
25.05.11 | long-overdue update of background and instructions
04.05.11 | well, as always, further improvements follow a release, incl. a foolish BLAST bug, and safer (albeit 'blind') sequence handling
                  - i.e. the same label more than once (irrespective of sequence) will become a separate entry, so be careful yesterday's user @ 5pmGMT!
03.05.11 | inspired by your runs and our work, new features have been implemented, and some bugs fixed
08.03.11 | the server now monitors the resources used and terminates greedy runs - bear with us while we finetune this, and let us know of any problems
10.02.11 | changes in the interface and in the algorithm, plus a poster
28.10.10 | an issue when some sequences were both queries and database entries (quite uncommon) was resolved
19.10.10 | bugs were fixed for BLAST-output runs (thanks to CH for the report), plus now you can switch off sequence labels
15.10.10 | self-hits of equal/less than 50% of sequence are shown in all-vs-all, very thin ribbons are drawn without borders for colour clarity, bug fixes
03.09.10 | further fine-tuning, incl. better sequence type handling, and flexible ribbon number management - PLEASE provide feedback
02.09.10 | new colouring options (E-value, % identity), improved annotation file and fixed process, bug fixes incl. a rare case of a few ribbons not being shown
01.09.10 | upload limit now up to 10MB, plus histograms might be limited to 'warm' colours or disabled altogether if the data is too much
30.08.10 | we had some downtime these couple of days, apologies... related to this, we have made some changes to improve both speed and stability
                  - of importance, sequence names are now handled better to fit in 20 characters (esp. so for uploaded BLAST output)
18.08.10 | Circoletto has now been accepted in Bioinformatics, i.e. you need to thank the anonymous reviewers for improvements in the past few days
                  - and of course much motivation and feedback was provided from the Team and other young and promising researchers at the Institute
12.08.10 | a contact form has now been added near the top of the page;
                  PLEASE use it to easily report bugs, send feedback etc., it will be greatly appreciated
12.08.10 | please note that the BLAST output must be in pairwise alignment format and in plain text (example)
09.08.10 | you can now upload BLAST output in the usual/default pairwise format instead of FASTA files
23.07.10 | another bug due to certain characters in sequence names has been dealt with by only allowing alphanumerics, underscores and vertical bars
                  - also, max name length is now 20 characters
20.07.10 | a small (and apparently infrequent) bug in the hit histograms has been corrected (hopefully)
18.07.10 | output size can now be set, and an optional annotation file can be loaded onto the ideograms
15.07.10 | by default, ideograms are now kept in the order the user has them, queries first, database next
                  - just make sure you have switched ribbon untangling off


code

Now on GitHub.


background and instructions

| intro
Circoletto is based on Circos by the incomparable Dr. Martin Krzywinski. It is currently built to visualise BLAST sequence comparison results, and also provides a small-scale BLAST server. In fact, Circoletto works best with small datasets and a few hundred links at most, so we actively control the process: Circoletto will only allow up to 1000 ribbons (i.e. local alignments).

| input formats and options
You can provide EITHER two FASTA-formatted nucleotide or amino acid sequence files, the query and database (the same file(name) twice translates to an all against all run), OR a precomputed pairwise alignment BLAST output in plain text. For the BLAST run, an E-value can be selected from presets, you can choose to run tBLASTx for DNA vs DNA sequences, and you can choose to pre-filter the sequences for low complexity.
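The news entries above mention that sequence labels are trimmed to 20 characters and that non-[\w-.] characters are replaced with '.'. As a rough illustration only, the Python sketch below applies those two rules to a FASTA header so you can preview how your labels might end up. It is not Circoletto's own code: the function name `normalise_label`, the choice to keep only the first whitespace-delimited word of the header, and the order of the two steps are all assumptions.

```python
import re

MAX_LABEL_LENGTH = 20  # label length limit mentioned in the news entries


def normalise_label(header):
    """Turn a FASTA header line into a short label (illustrative sketch)."""
    # Assumption: only the first word after '>' is used as the label.
    tokens = header.lstrip(">").split()
    label = tokens[0] if tokens else ""
    # Replace anything outside [A-Za-z0-9_.-] with '.'
    label = re.sub(r"[^\w.-]", ".", label, flags=re.ASCII)
    # Assumption: trimming happens after the character replacement.
    return label[:MAX_LABEL_LENGTH]


if __name__ == "__main__":
    for header in (">my/seq:1 some description", ">" + "a" * 30):
        print(header, "->", normalise_label(header))
```

If two of your headers collapse to the same label after this kind of trimming, it is probably safer to rename them yourself before uploading.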

| output options
For the output, you can control:
  whether you only want to show the best hit per query, and if so all the alignments of the best hit or the single best alignment of the best hit,
  whether you only want to show the sequences that produced hits,
  the ribbon colouring scheme,
  the ribbon untangling,
  the sequence/ideogram order and orientation,
  the use of the sequence labels,
  and finally you can select between PNG and SVG - for a quick look PNG is more than fine, but an SVG file, although slower, looks much better.

| results
Hitting the 'submit to Circoletto' button will reload the page, providing feedback before serving links to the visualisation and back to this form. On the output, everything is read clockwise by default. The ideograms of the queries are light grey and protrude compared to the dark grey ideograms of the database. You can control the colours of the ideograms or parts of them using an annotation file (example). If you do that, then you have the option to colour the ribbons with the colour of the database (!) ideograms or their domains as set in the aforementioned annotation file. For example, this could help you to follow links between certain sequences or parts of them - see default and this.

Inside the circle, ribbons represent the local alignments BLAST has produced (or you have provided), in four semi-transparent colours, blue, green, orange and red, representing the four quartiles up to the maximum score - i.e. a local alignment with a score of 80% of the maximum score is red, while one with 20% of the maximum score is blue. Be careful though, this is all relative, i.e. red does not mean universally good (or best), it could mean the best of bad alignments - and the same goes for blue, for example. Also, the bitscore (which is used by default for colouring) correlates heavily with alignment length, meaning that you might see (narrower) blue and (wider) red ribbons which might actually all describe alignments with 100% identity. You can easily see this if you run the same dataset with a different colouring scheme.
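To make the relative colouring more concrete, here is a minimal Python sketch of the quartile idea described above, together with the '(score-min)/(max-min)' ratio mentioned in the news. It is not Circoletto's actual code: the function name `ribbon_colour`, the `rescale` switch, and the exact handling of quartile boundaries are assumptions made for illustration.

```python
COLOURS = ["blue", "green", "orange", "red"]  # lowest to highest quartile


def ribbon_colour(score, scores, rescale=False):
    """Return a colour for `score` relative to all `scores` of the run.

    rescale=False: ratio = score / max
    rescale=True:  ratio = (score - min) / (max - min)
    """
    top = max(scores)
    low = min(scores) if rescale else 0.0
    span = top - low
    # If all scores are identical there is no range, so treat them as the top.
    ratio = 1.0 if span == 0 else (score - low) / span
    # Map the ratio to a quartile index 0..3 (a ratio of exactly 1.0 stays in the top quartile).
    index = min(int(ratio * 4), 3)
    return COLOURS[index]


if __name__ == "__main__":
    identities = [90.0, 93.0, 96.0, 99.0]  # hypothetical % identity values
    print([ribbon_colour(s, identities) for s in identities])
    print([ribbon_colour(s, identities, rescale=True) for s in identities])
```

With these hypothetical % identity values, the plain score/max ratio puts every ribbon in the red quartile, while the rescaled ratio spreads them over blue, green, orange and red, which is why the rescaled mode should give more colour range for % identity.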

Ribbons representing best hits are outlined and placed on top of all other ribbons; otherwise, wider ribbons are placed beneath narrower ribbons for clarity. More often than not, ribbons will overlap and produce a rather complex picture (with complex colours) - to assist with decoding this, we include a histogram on top of the ideograms, counting how many times each colour has hit the specific part of the sequence. Also, a ribbon will invert if the local alignment is inverted, i.e. if the query hit the other strand of a database sequence.
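The histogram idea can be sketched in a few lines of Python: for every position of a sequence, count how many ribbons of each colour cover it. This is only an illustration under stated assumptions, not how Circoletto or Circos compute it; the function name `colour_histogram`, the per-position resolution and the 1-based inclusive coordinates are hypothetical choices.

```python
from collections import Counter


def colour_histogram(length, hits):
    """Count, per sequence position, how many ribbons of each colour cover it.

    hits: iterable of (start, end, colour) with 1-based inclusive coordinates.
    """
    histogram = [Counter() for _ in range(length)]
    for start, end, colour in hits:
        # Inverted alignments may be given with start > end, so sort the ends.
        lo, hi = sorted((start, end))
        # Clip to the sequence so out-of-range coordinates are ignored.
        for pos in range(max(lo, 1), min(hi, length) + 1):
            histogram[pos - 1][colour] += 1
    return histogram


if __name__ == "__main__":
    # Hypothetical hits on a 10-residue sequence; the last one is inverted.
    hits = [(1, 5, "red"), (4, 8, "blue"), (8, 3, "red")]
    hist = colour_histogram(10, hits)
    for position, counts in enumerate(hist, start=1):
        print(position, dict(counts))
```

A real implementation would more likely bin positions rather than count every residue, especially for long sequences.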

hosted at the Bioinformatics Analysis Team / BAT