Load the File
Today we’re going to talk about how to generate a Geneious variant call report file. This is a question that we get frequently from clients who are using Geneious and they want to know how we generate our standard variant report file. We’re going to show you how to do that today.
We’re going to assume that you already have Geneious, you already have Geneious Prime launched and running, that you’ve already imported your sample FastQ files, you’ve already run the sequence alignment variant call algorithms against those FastQ files, and now you want to know how to generate the variant report. So that’s where we’re going to start.
In this case, we already have Geneious launched. If you look, there should be a contig file, it’ll have a three diagonal bar icon. You’ll have your sample name in here and it will say that it is assembled to some reference genome. In this case it’s Staph. aureus. So it’d be “sample_name_assembled_to_reference_genome”.
Set Up the Geneious Panels
The first thing you want to do is check that record. This is the document field or document panel in Geneious up here. That will bring up three other panels, or at least it should. This may depend on where you’re at with Geneious. If you’ve already run Geneious a few times it may look a little bit different than this. It depends on how you save and configure things, but you should be able to bring up this sequence view panel in the middle. On the right, there will be this configuration panel. Then down on the bottom, there will be a spreadsheet panel.
The first thing you want to do is make sure that you look at this tab list along the top here and click on “contig view”. You want to make sure that’s highlighted. The next step is to go over to the configuration panel on the right, and then you’ll see this list of options along the right panel. The one you want to choose is this yellow right arrow. Click the “Annotation and Tracks” panel and that’ll bring up this view.
Then what you want to do is make sure that “Show Annotations” is checked. You want to make sure that “Types” is checked. Make sure that “CDS” is checked, and then you want to make sure that the remaining fields are unchecked. I’m going to uncheck all of these other options here. Then at the bottom I make sure that “Tracks” is checked. You should also see a “Variants” track and make sure that’s checked. So this is what it should look like. This is just our default mode, by the way – Show Annotations, Types, CDS, Tracks, and Variants.
Then if you look along the bottom panel, it’ll say “Columns”, “Track”, and “Export Table”. We want to go to the “Track” option, left mouse click that to highlight it, and there should be a “Variants” track in here. We want to check that. That’s going to bring up another spreadsheet. I’ve already configured this, so I have some columns already set here, but you should see some kind of a spreadsheet down at the bottom at this point.
Changing the Variant Report Headers
The next step is to go to the “Columns” option and click on that. It will bring up a fairly long list of column names or column headers that you can choose to configure the spreadsheet. You can see some of these are already checked and I’ll show you how to do that in just a second. You can see there’s a long list here. You have “Name”, “Min” and “Max”, “Genomic Loci”, you can have things like the “Length”, “Amino Acid Change”, “Quality”, “CDS”, “Codon Changes”, and so on and so forth. You have all these different options.
What you want to do is go to the top, click “Manage Columns”, and that will bring up this view. The list on the left shows all the available column headers that you can choose from. The list on the right shows the ones we’ve already selected. If you want, you can pick any available column, choose the right arrow, and that’ll move it over to the selected column. So here it is over on the selected side. If you want to rearrange the order of the spreadsheet headers, you can click the up button and the down button to move it around, whatever you want to do there.
I already have these in order. I don’t really want “Quality” in this report, so I’m going to move it back out to the available column with the left arrow. What I’m left with is the selected list showing all the column headers that I want in this spreadsheet. So then I click “Ok” and now what’ll happen is in the bottom panel you’ll see those headers show up here. You can see “Name”, “Minimum”, “Maximum”, “Loci”, the length of the polymorphism, “Polymorphism Type”, “Amino Acid Change”, coding sequence and so forth.
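If you ever need to make the same kind of column selection on a table that has already been exported as a .csv file, it can be done with a short script instead of going back into Geneious. This is only a minimal sketch using Python’s standard csv module. The function name `select_columns`, the header names, and the sample rows are made up for illustration, and the headers in your own export may be spelled differently.

```python
import csv
import io


def select_columns(csv_text, wanted):
    """Return rows (as lists) holding only the wanted columns, in the wanted order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    available = reader.fieldnames or []
    missing = [name for name in wanted if name not in available]
    if missing:
        raise KeyError(f"columns not found in export: {missing}")
    rows = [list(wanted)]  # header row first
    for record in reader:
        rows.append([record[name] for name in wanted])
    return rows


# Made-up sample data for illustration only; not real Geneious output.
sample = (
    "Name,Quality,Minimum,Maximum\n"
    "C -> T,40,1200,1200\n"
    "G -> A,38,350,350\n"
)

# Drop "Quality" and keep the other three columns in this order.
for row in select_columns(sample, ["Name", "Minimum", "Maximum"]):
    print(row)
```

The order of the names in the `wanted` list plays the same role as the up and down buttons in the “Manage Columns” dialog.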
Export the Variant CSV Report
At this point you can export the table. Just click on the “Export Table” option and that’ll bring up a dialog box. You can go to this “Look in:” field and choose whatever directory you want to store the file in. Down here we have the file name. There’s a default naming convention. You can either leave that as it is or change it. You’ll see that there’s a couple of file types. One is the Excel .csv file, you can also use a tab-separated values or .tsv file if you want. We always just choose the default, the standard .csv file, and then we save it. That’s really all there is to creating the variant spreadsheet file.
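If you saved the .csv file and later find you need the tab-separated version, you don’t have to re-export from Geneious. Here is a minimal sketch of the conversion with Python’s standard csv module. The function name `csv_to_tsv` and the sample text are hypothetical, and the quoted “1,200” values are just there to show that a field containing a comma survives the conversion.

```python
import csv
import io


def csv_to_tsv(csv_text):
    """Re-write comma-separated text as tab-separated text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Made-up sample data for illustration only; not real Geneious output.
sample = 'Name,Minimum,Maximum\nC -> T,"1,200","1,200"\n'

print(csv_to_tsv(sample))
```

To use it on a real file, read the exported file’s contents into a string, pass it to the function, and write the result out with a .tsv suffix.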
Change to an Excel File
At this point we have the .csv file saved on our local workstation. You can also format that in a different Excel format so I’ll show you how to do that right now.
I went ahead and closed out Geneious and launched Excel. I’m going to assume that you have some version of Excel installed. One reason I want to walk through this is that most people want to convert the .csv file to an Excel formatted file that they can view and work with. I’ll show you how we do that here, it’s pretty straightforward, but there’s a couple of nice little tricks I can show you.
Choose File – Open, look for the .csv file, and open it. The formatting is not great here, so we’re going to reformat this a little bit. One thing I want to do first is save it in a different format. So we choose File – Save As, and it brings up this dialog box. This might be a bit different on your system depending on what operating system you’re using or what version of Excel, but somewhere it should say something like “File Format”, and by default it’s .csv. What we’re going to do is change that to .xls, it’s a standard Microsoft Excel format and it’ll say something like “Microsoft Excel 5.0/95 Workbook” with the .xls suffix. Click on that and then save it. So now the file is saved in an Excel format, an .xls format.
Formatting the Excel File
There are a couple of nice things you can do here. We can highlight the header record and convert it to a bold format. One thing I like to do here is go to View – Freeze the Top Row. If we scroll up and down, the top row stays frozen.
Then Edit – Select All, we want to select the entire spreadsheet in this case. Then we can go up to Format – Columns – Auto Fit Selection, and what “Auto Fit” does is look at all the cells and try to resize the columns so that every cell has this sort of optimal formatting. It just lays it out very nicely for you.
Go up to Data – Sort, which brings up this dialog box. The first thing I want to do is make sure “My List has Headers” is checked because we do have headers. Go to the first field here and I normally choose “Minimum”, the minimum genomic locus. I click on that, and then we use “Smallest to Largest” in this case. Then you go to the (+) option to add another level. Go back up here again and in this case we’re going to sort by the “Maximum” genomic locus, then “Smallest to Largest” again, and click “Ok”.
It has now sorted this across the genome from left to right, from smallest to largest genomic locus. If we do a little scrolling here, you’ll see that it’s increasing across the genome, left to right, smallest to largest. That’s kind of nice, because then we can just go scrolling through here and look at different polymorphism types. You can see what kind of coding sequences there are, if there are any codon changes, protein effects, and all that.
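If you’d rather do this two-level sort outside of Excel, here is a minimal sketch with Python’s standard csv module that sorts by “Minimum” first and “Maximum” second, both smallest to largest. The function names, the header names, and the sample rows are assumptions for illustration. The comma stripping in `to_int` is a defensive assumption in case positions are written with thousands separators such as “1,200”, so check what your own file actually contains.

```python
import csv
import io


def to_int(text):
    """Parse a position such as '1,200' or '1200' into an integer."""
    return int(text.replace(",", "").strip())


def sort_variants(csv_text, first="Minimum", second="Maximum"):
    """Return (header, rows) with rows ordered by `first`, then `second`, ascending."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = sorted(reader, key=lambda r: (to_int(r[first]), to_int(r[second])))
    return reader.fieldnames, rows


# Made-up sample data for illustration only; not real Geneious output.
sample = (
    "Name,Minimum,Maximum\n"
    'C -> T,"1,200","1,200"\n'
    "(A)3 -> (A)2,350,352\n"
    "G -> A,350,350\n"
)

header, rows = sort_variants(sample)
print(header)
for row in rows:
    print(row["Name"], row["Minimum"], row["Maximum"])
```

Sorting on a pair of numbers is the same idea as adding a second level in the Excel sort dialog: the second column only breaks ties in the first.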
Don’t forget to save that, go to File – Save. Now you have yourself the variant report.