Fastest Way to Get All RVF Text

standay
Posts: 274
Joined: Fri Jun 18, 2021 3:07 pm

Fastest Way to Get All RVF Text

Post by standay »

Sergey,

What is the fastest way to get all the plain text from an RVF file? I assume I'd have to load the file into an rve or rv first.

I found rve.GetTextBuf() and RVGetTextRange(). If rve.GetTextBuf() is the best way, do you have example code on how to use that? RVGetTextRange is no problem if that's what I should use.

Thanks

Stan
Sergey Tkachenko
Site Admin
Posts: 17559
Joined: Sat Aug 27, 2005 10:28 am

Re: Fastest Way to Get All RVF Text

Post by Sergey Tkachenko »

1) Yes, you need to load the RVF file into a TRichView (richview.LoadRVF). If you do not want to create a visual component, you can load it into a TRVReportHelper (rvreporthelper.RichView.LoadRVF).
If you do not need to display the document, it is not necessary to format it (i.e. you do not need to call richview.Format / rvreporthelper.Init). So the slowest operation (formatting) can be skipped.

2) There are several ways to get the document as text.

If the text is meant to be displayed to a human, use GetAllText(richview) from the RVGetTextW unit
(this function returns a Unicode string; GetAllText from the RVGetText unit is its ANSI analog).
It produces a result that is very similar to that of richview.SaveTextToStreamW.

The alternative is RVGetTextRange(richview, 0, RVGetTextLength(richview)).
This call returns a text string that has a one-to-one correspondence with the original document (i.e., when you know a character position in the document as (RVData, ItemNo, OffsetInItem), you can calculate the corresponding character position in this string using RichViewToLinear, and go the other way using LinearToRichView).
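
As an illustration of the two steps above, here is a minimal Delphi sketch that loads an RVF file into a non-visual TRVReportHelper, skips formatting, and returns the text either way. It is a sketch under assumptions, not tested code: ExtractRVFText is a hypothetical helper name, the unit names in the uses clause (RVStyle, RichView, RVReport, RVLinear) are assumed from a typical TRichView installation, and a Unicode version of Delphi is assumed so that both results fit in a string.

```pascal
uses
  SysUtils,
  RVStyle, RichView, RVReport, // assumed unit names for TRVStyle, TRichView, TRVReportHelper
  RVGetTextW,                  // GetAllText (Unicode version)
  RVLinear;                    // assumed unit for RVGetTextRange / RVGetTextLength

// Hypothetical helper: returns the plain text of one RVF file,
// or an empty string if the file cannot be loaded.
function ExtractRVFText(const FileName: string; ForSearch: Boolean): string;
var
  Helper: TRVReportHelper;
begin
  Result := '';
  Helper := TRVReportHelper.Create(nil);
  try
    // Assumption: a TRVStyle must be assigned before loading.
    // It is owned by Helper, so it is freed together with it.
    Helper.RichView.Style := TRVStyle.Create(Helper);
    if not Helper.RichView.LoadRVF(FileName) then
      Exit;
    // Deliberately no Format / Init call: the document is never displayed.
    if ForSearch then
      // Text with a one-to-one correspondence to document positions
      Result := RVGetTextRange(Helper.RichView, 0,
        RVGetTextLength(Helper.RichView))
    else
      // Text intended for reading by a human
      Result := GetAllText(Helper.RichView);
  finally
    Helper.Free;
  end;
end;
```

A call such as ExtractRVFText('notes.rvf', True) would then return the searchable form; the file name here is only a placeholder.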
standay

Re: Fastest Way to Get All RVF Text

Post by standay »

Sergey,

Thanks for the ideas. I have both working, not sure which is faster. But they let me do my full text searches again so it's all good!

Stan
Sergey Tkachenko

Re: Fastest Way to Get All RVF Text

Post by Sergey Tkachenko »

I think that the speed is about the same.
But if you use it for searching, RVGetTextRange is preferable, because its result does not contain representations of non-text items.

Moreover, if you find a substring in the text returned by RVGetTextRange, you can then find the position of this substring in the document.
See the demo https://www.trichview.com/forums/viewto ... f=3&t=9278
standay

Re: Fastest Way to Get All RVF Text

Post by standay »

I wound up using RVGetTextRange. Yes, I'm using it to search a set of RVF files. I run through them and add the text to a virtual string tree, then I search the tree. That way, the first search takes a few seconds, but after that searches are very fast.

Stan
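
The caching idea described in this post can be sketched roughly as follows. This is not Stan's code: it swaps the virtual string tree for a plain TDictionary, and TTextCache, TExtractText and FindFiles are hypothetical names introduced only for illustration. The extraction step is passed in as a callback, so any function that returns the text of one RVF file (for example one built on RVGetTextRange) could be plugged in.

```pascal
uses
  SysUtils, Classes, Generics.Collections;

type
  // Hypothetical callback: returns the plain text of one RVF file
  TExtractText = function(const FileName: string): string;

  // Hypothetical cache: extracts each file's text once, then searches in memory
  TTextCache = class
  private
    FTexts: TDictionary<string, string>;
    FExtract: TExtractText;
  public
    constructor Create(Extract: TExtractText);
    destructor Destroy; override;
    procedure FindFiles(const FileNames: array of string;
      const Needle: string; Hits: TStrings);
  end;

constructor TTextCache.Create(Extract: TExtractText);
begin
  inherited Create;
  FExtract := Extract;
  FTexts := TDictionary<string, string>.Create;
end;

destructor TTextCache.Destroy;
begin
  FTexts.Free;
  inherited;
end;

// Adds to Hits the name of every file whose text contains Needle
// (case-insensitive). Only the first search for a file pays the loading cost.
procedure TTextCache.FindFiles(const FileNames: array of string;
  const Needle: string; Hits: TStrings);
var
  FileName, Text, LowerNeedle: string;
begin
  LowerNeedle := AnsiLowerCase(Needle);
  for FileName in FileNames do
  begin
    if not FTexts.TryGetValue(FileName, Text) then
    begin
      // Cache the lower-cased text so later searches need no conversion
      Text := AnsiLowerCase(FExtract(FileName));
      FTexts.Add(FileName, Text);
    end;
    if Pos(LowerNeedle, Text) > 0 then
      Hits.Add(FileName);
  end;
end;
```

Because the cached copy is lower-cased, it is only suitable for deciding which files match; to map a hit back to a document position, the unmodified RVGetTextRange string would have to be kept instead.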