int CueSheetModel::parseCueFile(QFile &cueFile, const QDir &baseDir, QCoreApplication *application, const QTextCodec *codec)
{
	cueFile.reset();
	qDebug("\n[Cue Sheet Import]");
	bool bForceLatin1 = false;

	//Reject very large files, as parsing might take until forever
	if(cueFile.size() >= 10485760i64) {
		qWarning("File is very big. Probably not a Cue Sheet. Rejecting...");
		return 2;
	}

	//Test selected Codepage for decoding errors
	qDebug("Character encoding is: %s.", codec->name().constData());
	const QString replacementSymbol = QString(QChar(QChar::ReplacementCharacter));
	QByteArray testData = cueFile.peek(1048576);
	if((!testData.isEmpty()) && codec->toUnicode(testData.constData(), testData.size()).contains(replacementSymbol)) {
		qWarning("Decoding error using selected codepage (%s). Enforcing Latin-1.", codec->name().constData());
		bForceLatin1 = true;
	}
	testData.clear();

	//Init text stream
	QTextStream cueStream(&cueFile);
	cueStream.setAutoDetectUnicode(false);
	cueStream.setCodec(bForceLatin1 ? "latin1" : codec->name());
	cueStream.seek(0i64);

	//Create regular expressions
	QRegExp rxFile("^FILE\\s+(\"[^\"]+\"|\\S+)\\s+(\\w+)$", Qt::CaseInsensitive);
	QRegExp rxTrack("^TRACK\\s+(\\d+)\\s(\\w+)$", Qt::CaseInsensitive);
	QRegExp rxIndex("^INDEX\\s+(\\d+)\\s+([0-9:]+)$", Qt::CaseInsensitive);
	QRegExp rxTitle("^TITLE\\s+(\"[^\"]+\"|\\S+)$", Qt::CaseInsensitive);
	QRegExp rxPerformer("^PERFORMER\\s+(\"[^\"]+\"|\\S+)$", Qt::CaseInsensitive);
	QRegExp rxGenre("^REM\\s+GENRE\\s+(\"[^\"]+\"|\\S+)$", Qt::CaseInsensitive);
	QRegExp rxYear("^REM\\s+DATE\\s+(\\d+)$", Qt::CaseInsensitive);

	bool bPreamble = true;
	bool bUnsupportedTrack = false;

	CueSheetFile *currentFile = NULL;
	CueSheetTrack *currentTrack = NULL;

	m_albumTitle.clear();
	m_albumPerformer.clear();
	m_albumGenre.clear();
	m_albumYear = 0;

	//Loop over the Cue Sheet until all lines were processed
	for(int lines = 0; lines < INT_MAX; lines++) {
		if(application) {
			application->processEvents();
			if(lines < 128) Sleep(10);
		}

		if(cueStream.atEnd()) {
			qDebug("End of Cue Sheet file.");
			break;
		}

		QString line = cueStream.readLine().trimmed();

		/* --- FILE --- */
		if(rxFile.indexIn(line) >= 0) {
			qDebug("%03d File: <%s> <%s>", lines, rxFile.cap(1).toUtf8().constData(), rxFile.cap(2).toUtf8().constData());
			if(currentFile) {
				if(currentTrack) {
					if(currentTrack->isValid()) {
						currentFile->addTrack(currentTrack);
						currentTrack = NULL;
					} else {
						LAMEXP_DELETE(currentTrack);
					}
				}
				if(currentFile->isValid()) {
					m_files.append(currentFile);
					currentFile = NULL;
				} else {
					LAMEXP_DELETE(currentFile);
				}
			} else {
				LAMEXP_DELETE(currentTrack);
			}
			if(!rxFile.cap(2).compare("WAVE", Qt::CaseInsensitive) || !rxFile.cap(2).compare("MP3", Qt::CaseInsensitive) || !rxFile.cap(2).compare("AIFF", Qt::CaseInsensitive)) {
				currentFile = new CueSheetFile(baseDir.absoluteFilePath(UNQUOTE(rxFile.cap(1))));
				qDebug("%03d File path: <%s>", lines, currentFile->fileName().toUtf8().constData());
			} else {
				bUnsupportedTrack = true;
				qWarning("%03d Skipping unsupported file of type '%s'.", lines, rxFile.cap(2).toUtf8().constData());
				currentFile = NULL;
			}
			bPreamble = false;
			currentTrack = NULL;
			continue;
		}

		/* --- TRACK --- */
		if(rxTrack.indexIn(line) >= 0) {
			if(currentFile) {
				qDebug("%03d Track: <%s> <%s>", lines, rxTrack.cap(1).toUtf8().constData(), rxTrack.cap(2).toUtf8().constData());
				if(currentTrack) {
					if(currentTrack->isValid()) {
						currentFile->addTrack(currentTrack);
						currentTrack = NULL;
					} else {
						LAMEXP_DELETE(currentTrack);
					}
				}
				if(!rxTrack.cap(2).compare("AUDIO", Qt::CaseInsensitive)) {
					currentTrack = new CueSheetTrack(currentFile, rxTrack.cap(1).toInt());
				} else {
					bUnsupportedTrack = true;
					qWarning("%03d Skipping unsupported track of type '%s'.", lines, rxTrack.cap(2).toUtf8().constData());
					currentTrack = NULL;
				}
			} else {
				LAMEXP_DELETE(currentTrack);
			}
			bPreamble = false;
			continue;
		}

		/* --- INDEX --- */
		if(rxIndex.indexIn(line) >= 0) {
			if(currentFile && currentTrack) {
				qDebug("%03d Index: <%s> <%s>", lines, rxIndex.cap(1).toUtf8().constData(), rxIndex.cap(2).toUtf8().constData());
				//Only INDEX 01 marks the start of the track
				if(rxIndex.cap(1).toInt() == 1) {
					currentTrack->setStartIndex(parseTimeIndex(rxIndex.cap(2)));
				}
			}
			continue;
		}

		/* --- TITLE --- */
		if(rxTitle.indexIn(line) >= 0) {
			if(bPreamble) {
				m_albumTitle = UNQUOTE(rxTitle.cap(1)).simplified();
			} else if(currentFile && currentTrack) {
				qDebug("%03d Title: <%s>", lines, rxTitle.cap(1).toUtf8().constData());
				currentTrack->setTitle(UNQUOTE(rxTitle.cap(1)).simplified());
			}
			continue;
		}

		/* --- PERFORMER --- */
		if(rxPerformer.indexIn(line) >= 0) {
			if(bPreamble) {
				m_albumPerformer = UNQUOTE(rxPerformer.cap(1)).simplified();
			} else if(currentFile && currentTrack) {
				qDebug("%03d Performer: <%s>", lines, rxPerformer.cap(1).toUtf8().constData());
				currentTrack->setPerformer(UNQUOTE(rxPerformer.cap(1)).simplified());
			}
			continue;
		}

		/* --- GENRE --- */
		if(rxGenre.indexIn(line) >= 0) {
			if(bPreamble) {
				QString temp = UNQUOTE(rxGenre.cap(1)).simplified();
				for(int i = 0; g_lamexp_generes[i]; i++) {
					if(temp.compare(g_lamexp_generes[i],
							Qt::CaseInsensitive) == 0) {
						m_albumGenre = QString(g_lamexp_generes[i]);
						break;
					}
				}
			} else if(currentFile && currentTrack) {
				qDebug("%03d Genre: <%s>", lines, rxGenre.cap(1).toUtf8().constData());
				//Unquote first, then collapse whitespace (same order as for the album genre)
				QString temp = UNQUOTE(rxGenre.cap(1)).simplified();
				for(int i = 0; g_lamexp_generes[i]; i++) {
					if(temp.compare(g_lamexp_generes[i], Qt::CaseInsensitive) == 0) {
						currentTrack->setGenre(QString(g_lamexp_generes[i]));
						break;
					}
				}
			}
			continue;
		}

		/* --- YEAR --- */
		if(rxYear.indexIn(line) >= 0) {
			if(bPreamble) {
				bool ok = false;
				unsigned int temp = rxYear.cap(1).toUInt(&ok);
				if(ok) m_albumYear = temp;
			} else if(currentFile && currentTrack) {
				qDebug("%03d Year: <%s>", lines, rxYear.cap(1).toUtf8().constData());
				bool ok = false;
				unsigned int temp = rxYear.cap(1).toUInt(&ok);
				if(ok) currentTrack->setYear(temp);
			}
			continue;
		}
	}

	//Append the very last track/file that is still pending
	if(currentFile) {
		if(currentTrack) {
			if(currentTrack->isValid()) {
				currentFile->addTrack(currentTrack);
				currentTrack = NULL;
			} else {
				LAMEXP_DELETE(currentTrack);
			}
		}
		if(currentFile->isValid()) {
			m_files.append(currentFile);
			currentFile = NULL;
		} else {
			LAMEXP_DELETE(currentFile);
		}
	}

	//Finally calculate duration of each track
	int nFiles = m_files.count();
	for(int i = 0; i < nFiles; i++) {
		if(application) {
			application->processEvents();
			Sleep(10);
		}
		CueSheetFile *currentFile = m_files.at(i);
		int nTracks = currentFile->trackCount();
		if(nTracks > 1) {
			for(int j = 1; j < nTracks; j++) {
				CueSheetTrack *currentTrack = currentFile->track(j);
				CueSheetTrack *previousTrack = currentFile->track(j-1);
				double duration = currentTrack->startIndex() - previousTrack->startIndex();
				previousTrack->setDuration(qMax(0.0, duration));
			}
		}
	}

	//Sanity check of track numbers
	if(nFiles > 0) {
		bool hasTracks = false;
		int previousTrackNo = -1;
		bool trackNo[100];
		for(int i = 0; i < 100; i++) {
			trackNo[i] = false;
		}
		for(int i = 0; i < nFiles; i++) {
			if(application) {
				application->processEvents();
				Sleep(10);
			}
			CueSheetFile
				*currentFile = m_files.at(i);
			int nTracks = currentFile->trackCount();
			if(nTracks > 1) {
				for(int j = 0; j < nTracks; j++) {
					int currentTrackNo = currentFile->track(j)->trackNo();
					if(currentTrackNo > 99) {
						qWarning("Track #%02d is invalid (maximum is 99), Cue Sheet is inconsistent!", currentTrackNo);
						return ErrorInconsistent;
					}
					if(currentTrackNo <= previousTrackNo) {
						qWarning("Non-increasing track numbers (%02d -> %02d), Cue Sheet is inconsistent!", previousTrackNo, currentTrackNo);
						return ErrorInconsistent;
					}
					if(trackNo[currentTrackNo]) {
						qWarning("Track #%02d exists multiple times, Cue Sheet is inconsistent!", currentTrackNo);
						return ErrorInconsistent;
					}
					trackNo[currentTrackNo] = true;
					previousTrackNo = currentTrackNo;
					hasTracks = true;
				}
			}
		}
		if(!hasTracks) {
			qWarning("Could not find at least one valid track in the Cue Sheet!");
			return ErrorInconsistent;
		}
		return ErrorSuccess;
	} else {
		qWarning("Could not find at least one valid input file in the Cue Sheet!");
		return bUnsupportedTrack ? ErrorUnsupported : ErrorBadFile;
	}
}

ParsingStatus ParserArchiveFoolzUs::parseHTML(QString html)
{
    QRegExp rxImages("<div class=\"thread_image_box\"[^>]*>[^<]*<a href=\"([^\"]+)\"(?:[^<]+)(<[^>]*>)[^<]*</a>", Qt::CaseInsensitive, QRegExp::RegExp2);
    QRegExp rxThreads("<a href=\"([^\"]+)\"[^>]*>View</a>", Qt::CaseSensitive, QRegExp::RegExp2);
    QRegExp rxTitle("<span class=\"subject\">([^<]+)</span>");
    int pos;
    _IMAGE i;

    // Reset the parser state left over from a previous run
    _html = html;
    _images.clear();
    _redirect.clear();
    _urlList.clear();
    _statusCode.hasErrors = false;
    _statusCode.hasImages = false;
    _statusCode.hasTitle = false;
    _statusCode.isFrontpage = false;

    i.downloaded = false;
    i.requested = false;

    // Treat the page as a front page when it contains more than one </aside> tag.
    // Alternative check: !html.contains("<div id=\"ca_thread_html\">")
    const bool pageIsFrontpage = html.count("</aside>") > 1;

    if (pageIsFrontpage) {
        _statusCode.isFrontpage = true;

        // Collect the URL of every thread linked with a "View" anchor
        pos = 0;
        while ((pos = rxThreads.indexIn(html, pos)) != -1) {
            QString sUrl = rxThreads.cap(1);
            if (sUrl.endsWith("/")) {
                sUrl.chop(1);   // drop the trailing slash
            }
            _urlList.append(QUrl(sUrl));
            pos += rxThreads.matchedLength();
        }
    }
    else {
        // Checking for Images
        pos = 0;
        while ((pos = rxImages.indexIn(html, pos)) != -1) {
            const QString url = rxImages.cap(1);
            // The file name is everything after the last '/' of the URL
            i.originalFilename = url.mid(url.lastIndexOf("/") + 1);
            i.largeURI = url;
            i.thumbURI = "";
            _images.append(i);
            _statusCode.hasImages = true;
            pos += rxImages.matchedLength();
        }

        // Use the first subject line as the thread title
        if (rxTitle.indexIn(html) != -1) {
            _threadTitle = rxTitle.cap(1);
            _statusCode.hasTitle = true;
        }
    }

    return _statusCode;
}