Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
225 views
in Technique[技术] by (71.8m points)

c++ - How can I asynchronously load data from large files in Qt?

I'm using Qt 5.2.1 to implement a program that reads in data from a file (could be a few bytes to a few GB) and visualises that data in a way that's dependent on every byte. My example here is a hex viewer.

One object does the reading, and emits a signal dataRead() when it's read a new block of data. The signal carries a pointer to a QByteArray like so:

filereader.cpp

void FileReader::startReading()
{

    /* Object state code here... */

        {
            QFile inFile(fileName);

            if (!inFile.open(QIODevice::ReadOnly))
            {
                changeState(STARTED, State(ERROR, QString()));
                return;
            }

            while(!inFile.atEnd())
            {
                QByteArray *qa = new QByteArray(inFile.read(DATA_SIZE));
                qDebug() << "emitting dataRead()";
                emit dataRead(qa);
            }
        }

    /* Emit EOF signal */

}

The viewer has its loadData slot connected to this signal, and this is the function that displays the data:

hexviewer.cpp

void HexViewer::loadData(QByteArray *data)
{
    QString hexString = data->toHex();

    for (int i = 0; i < hexString.length(); i+=2)
    {
        _ui->hexTextView->insertPlainText(hexString.at(i));
        _ui->hexTextView->insertPlainText(hexString.at(i+1));
        _ui->hexTextView->insertPlainText(" ");
    }

    delete data;
}

The first problem is that if this is just run as-is, the GUI thread will become completely unresponsive. All of the dataRead() signals will be emitted before the GUI is ever redrawn.

(The full code can be run, and when you use a file bigger than about 1kB, you will see this behaviour.)

Going by the response to my forum post Non-blocking local file IO in Qt5 and the answer to another Stack Overflow question How to do async file io in qt?, the answer is: use threads. But neither of these answers go into any detail as to how to shuffle the data itself around, nor how to avoid common errors and pitfalls.

If the data was small (of the order of a hundred bytes) I'd just emit it with the signal. But in the case the file is GB in size (edit) or if the file is on a network-based filesystem eg. NFS, Samba share, I don't want the UI to lock up just because reading the file blocks.

The second problem is that the mechanics of using new in the emitter and delete in the receiver seems a bit naive: I'm effectively using the entire heap as a cross-thread queue.

Question 1: Does Qt have a better/idiomatic way to move data across threads while limiting memory consumption? Does it have a thread safe queue or other structures that can simplify this whole thing?

Question 2: Does I have to implement the threading etc. myself? I'm not a huge fan of reinventing wheels, especially regarding memory management and threading. Are there higher level constructs that can already do this, like there are for network transport?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

First of all, you don't have any multithreading in your app at all. Your FileReader class is a subclass of QThread, but it does not mean that all FileReader methods will be executed in another thread. In fact, all your operations are performed in the main (GUI) thread.

FileReader should be a QObject and not a QThread subclass. Then you create a basic QThread object and move your worker (reader) to it using QObject::moveToThread. You can read about this technique here.

Make sure you have registered FileReader::State type using qRegisterMetaType. This is necessary for Qt signal-slot connections to work across different threads.

An example:

HexViewer::HexViewer(QWidget *parent) :
    QMainWindow(parent),
    _ui(new Ui::HexViewer),
    _fileReader(new FileReader())
{
    qRegisterMetaType<FileReader::State>("FileReader::State");

    QThread *readerThread = new QThread(this);
    readerThread->setObjectName("ReaderThread");
    connect(readerThread, SIGNAL(finished()),
            _fileReader, SLOT(deleteLater()));
    _fileReader->moveToThread(readerThread);
    readerThread->start();

    _ui->setupUi(this);

    ...
}

void HexViewer::on_quitButton_clicked()
{
    _fileReader->thread()->quit();
    _fileReader->thread()->wait();

    qApp->quit();
}

Also it is not necessary to allocate data on the heap here:

while(!inFile.atEnd())
{
    QByteArray *qa = new QByteArray(inFile.read(DATA_SIZE));
    qDebug() << "emitting dataRead()";
    emit dataRead(qa);
}

QByteArray uses implicit sharing. It means that its contents are not copied again and again when you pass a QByteArray object across functions in a read-only mode.

Change the code above to this and forget about manual memory management:

while(!inFile.atEnd())
{
    QByteArray qa = inFile.read(DATA_SIZE);
    qDebug() << "emitting dataRead()";
    emit dataRead(qa);
}

But anyway, the main problem is not with multithreading. The problem is that QTextEdit::insertPlainText operation is not cheap, especially when you have a huge amount of data. FileReader reads file data pretty quickly and then floods your widget with new portions of data to display.

It must be noted that you have a very ineffectual implementation of HexViewer::loadData. You insert text data char by char which makes QTextEdit constantly redraw its contents and freezes the GUI.

You should prepare the resulting hex string first (note that data parameter is not a pointer anymore):

void HexViewer::loadData(QByteArray data)
{
    QString tmp = data.toHex();

    QString hexString;
    hexString.reserve(tmp.size() * 1.5);

    const int hexLen = 2;

    for (int i = 0; i < tmp.size(); i += hexLen)
    {
        hexString.append(tmp.mid(i, hexLen) + " ");
    }

    _ui->hexTextView->insertPlainText(hexString);
}

Anyway, the bottleneck of your application is not file reading but QTextEdit updating. Loading data by chunks and then appending it to the widget using QTextEdit::insertPlainText will not speed up anything. For files less than 1Mb it is faster to read the whole file at once and then set the resulting text to the widget in a single step.

I suppose you can't easily display huge texts larger than several megabytes using default Qt widgets. This task requires some non-trivial approch that in general has nothing to do with multithreading or asynchronous data loading. It's all about creating some tricky widget which won't try to display its huge contents at once.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...