Troubleshooting¶
Standard error (stderr) and standard out (stdout) are directed to the
directory /home/ubuntu/logs
. Errors will appear in the stderr
file for
the corresponding datasets. There may also be useful errors in the
stderr_master.log
and stdout_master.log
files that are created
in the directory from which you ran the python Master.py
call. In
addition, file outputs from each step of the pipeline can be found in
/home/ubuntu/proc/datasetID_proc_16S
, which can help to diagnose
errors. Common sources of errors or difficulties include:
Bad formatting of the summary, barcodes, or primers file. There should be no white spaces, and columns should be tab-delimited. If you created the file in Excel, it may have carriage return or other non-linux formatting characters that introduce difficulties. Safest is to go from a previous summary file template, or to create it manually.
Sometimes emacs or other editors will insert multiple tab characters where you only pressed tab once. Make sure that all white spaces correspond to one tab character only.
Moving text files between OS’s can also lead to errors. For example, the Mac newline character is a weird
^M
carriage return. Make sure that any of your files replace these with normal newline (\n
) characters. One way to check this is toless
your file (e.g.$ less <your_file>
).Typos in a 16S attribute key-value pair in the summary file.
Incorrect ASCII encoding specified. Check whether 33 or 64 using
usearch8 -fastq_chars yourFASTQfile.fastq
. Default is 33, so if left unspecified and the file is ASCII base 64, quality trimming will fail.For ITS sequences, the RDP classifier often runs out of RAM when loading the UNITE database. This is usually what happened if you find
RDP_classification.txt
to be empty in the relevant folder in/home/ubuntu/proc
. If you rerun the pipeline several times, have checked everything else and keep encountering this problem, you may need a node with more RAM to do your processing.If you specify a file containing a list of FASTQ/FASTA files for the raw reads, make sure the file paths are relative to the directory hosting the summary file.