Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
586 views
in Technique[技术] by (71.8m points)

c# - Entity Framework Core Performance Optimization to ingest very large folder of the same table

I have a background service which in C# which ingest 3600 xml files (5Gb file size in total) to SQL Server database. The duration to complete the ingestion is around 16 hours. I use hangfire to create 3 jobs/threads and each job will have one folder to ingest, Folder A, B, C.

Problem is Folder C is very extra heavy. My idea is split the files in Folder C into two folders, folder C1 and folder C2. So now, I have 4 jobs/threads, Folder A, B, C1 and C2. But the problem is C1 and C2 job hit database error below which I believe due to they both assessing to the same table.

An exception occurred in the database while saving changes for context type 'xxxContext'. System.InvalidOperationException: A second operation started on this context before a previous operation completed. This is usually caused by different threads using the same instance of DbContext

and another time with this error:

An exception occurred in the database while saving changes for context type 'xxxContext'. System.InvalidOperationException: Collection was modified; enumeration operation may not execute.

and error from hangfire is below:

Hangfire.Storage.DistributedLockTimeoutException Timeout expired. The timeout elapsed prior to obtaining a distributed lock on the 'HangFire:IIngestService.IngestPersonXML' resource.

Hangfire.Storage.DistributedLockTimeoutException: Timeout expired. The timeout elapsed prior to obtaining a distributed lock on the 'HangFire:IIngestService.IngestPersonXML' resource.

When I use Parallel.ForEach, I also get this error:

System.InvalidOperationException: 'Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.'

I only need to insert into db. No update or delete operation needed. Is there any workaround for this?

question from:https://stackoverflow.com/questions/65915661/entity-framework-core-performance-optimization-to-ingest-very-large-folder-of-th

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

EF in not meant for this sort of operations. Use SqlBulCopy for that.
There are libraries providing it seamlessly for EF, but you can write your own implementation aswell - it's not that complex

Really don't get this part

I only need to insert into db. Do update or delete operation needed. Is there any workaround for this?

So do you need to update or not? Well.. if you need to update a bunch of rows, insert them with bulk copy into temp table and then just do join update.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...