I want better logic for computing the splitters (markers) when blank pages are detected: remove the blank pages, then return the splitters to the UI so it can show them.
Scenario 1:
- User uploads 1 file (5 pages) => P1 - blank, P2 - not blank, P3 - blank, P4 - not blank, P5 - blank
- Detect blank pages and create splitters.
- Remove blank pages (P1, P3, P5).
- Create a new file containing only the non-blank pages (P2, P4).
- Return the markers to the UI.
- Final result for splitters should be: 0, 1
Scenario 2:
- User uploads 1 file (9 pages) => P1 P2 - blank, P3 P4 - not blank, P5 P6 - blank, P7 - not blank, P8 P9 - blank
- Detect blank pages and create splitters.
- Remove blank pages (P1, P2, P5, P6, P8, P9).
- Create a new file containing only the non-blank pages (P3, P4, P7).
- Return the markers to the UI.
- Final result for splitters should be: 0, 2
PS: 0 will always be in the splitters, because it is the index where the first new file starts.
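One way to read the two scenarios is that a marker is the index, in the blank-free output file, of the first non-blank page after the start of the file or after a run of blank pages. The sketch below works under that assumption and is independent of any imaging library: `BlankPageMarkers`, `Compute`, and the `isBlank` flag list are hypothetical names, and the flags are assumed to come from a single detection pass over the source pages.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BlankPageMarkers
{
    // Hypothetical helper: isBlank[i] is true when source page i is blank.
    // Returns split markers as indexes into the blank-free output file.
    public static List<int> Compute(IReadOnlyList<bool> isBlank)
    {
        var markers = new List<int>();
        int keptCount = 0;        // non-blank pages seen so far
        bool startsNewDoc = true; // the first kept page always opens a document

        foreach (bool blank in isBlank)
        {
            if (blank)
            {
                // A blank page (or a run of them) ends the current document.
                startsNewDoc = true;
                continue;
            }

            if (startsNewDoc)
            {
                markers.Add(keptCount); // output index of this kept page
                startsNewDoc = false;
            }
            keptCount++;
        }

        if (markers.Count == 0)
        {
            // Assumption: keep the "0 is always present" rule even when
            // every page is blank or the file is empty.
            markers.Add(0);
        }
        return markers;
    }
}
```

With the flags from Scenario 1 (blank, not blank, blank, not blank, blank) this yields 0, 1, and with the flags from Scenario 2 it yields 0, 2. Trailing blank pages never add a marker, so no marker can point past the last kept page.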
This is what I have for now, and it works, but I would like better logic for it.
    public List<int> GetMarkers(string mergedFile, SplitAndMergeFilesRequestDto optionsDto)
    {
        //improvements - remove invalid markers if higher than page count
        //improvements - deal with blank pages only (1,2,3 blank pages)
        //improvements - write tests
        var markersToReturn = new List<int>();
        var frameCount = RegisteredDecoders.GetImageInfo(mergedFile).FrameCount;
        for (int i = 0; i < frameCount; i++)
        {
            bool blankPageDetected = false;
            string tempFileName = GetFullPathForMergedFile();
            for (var frameIndex = 0; frameIndex < RegisteredDecoders.GetImageInfo(mergedFile).FrameCount; frameIndex++)
            {
                using (Stream readStream = new FileStream(mergedFile, FileMode.Open, FileAccess.Read))
                {
                    using (var img = new AtalaImage(readStream, frameIndex, null))
                    {
                        Log.Information($"Checking for BlankPage. FrameIndex: {frameIndex}");
                        if (IsBlankPage(img))
                        {
                            readStream.Seek(0, SeekOrigin.Begin);
                            var tiffDocument = new TiffDocument(readStream);
                            tiffDocument.Pages.RemoveAt(frameIndex);
                            tiffDocument.Save(tempFileName);
                            if (!markersToReturn.Contains(frameIndex))
                            {
                                markersToReturn.Add(frameIndex);
                            }
                            blankPageDetected = true;
                            Log.Information($"BlankPage found at FrameIndex: {frameIndex}");
                            break;
                        }
                    }
                }
            }
            if (blankPageDetected)
            {
                File.Copy(tempFileName, mergedFile, true);
                File.Delete(tempFileName);
            }
            else
            {
                break;
            }
        }
        Log.ForContext<BlankPageStrategyCalculateMarkersService>()
            .Information("SplitAndMerge - Created SplitMarkers: {@SplitMarkers}", markersToReturn);
        if (!markersToReturn.Contains(0))
        {
            markersToReturn.Insert(0, 0);
        }
        return markersToReturn;
    }
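The method above rewrites the file once per blank page because each removal shifts the indexes of the pages after it. A possible alternative, sketched here, is to collect every blank index first and then remove them from highest to lowest, so earlier indexes stay valid and the document only needs to be saved once. `PageRemoval` and `RemovePagesAt` are hypothetical names, and the sketch uses a plain `List<string>` as a stand-in for the page collection; whether `tiffDocument.Pages` can be passed to a generic helper like this is an assumption that would need checking.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PageRemoval
{
    // Hypothetical helper: removes the pages at the given indexes.
    // Indexes refer to positions in the ORIGINAL list, before any removal.
    public static void RemovePagesAt<T>(IList<T> pages, IEnumerable<int> blankIndexes)
    {
        // Highest index first, so removing a page never shifts
        // a page that is still waiting to be removed.
        foreach (int index in blankIndexes.Distinct().OrderByDescending(i => i))
        {
            // Skip indexes outside the document instead of throwing.
            if (index >= 0 && index < pages.Count)
            {
                pages.RemoveAt(index);
            }
        }
    }
}
```

For Scenario 1, calling `RemovePagesAt` on a list holding P1 to P5 with indexes 0, 2, 4 leaves P2 and P4.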
question from:
https://stackoverflow.com/questions/65517007/c-sharp-logic-for-splitting-the-pdf-tiff-file-when-there-is-a-blank-page-det