From 916b0f4a081b26ebb4b58afbcb651fc74dbe23cc Mon Sep 17 00:00:00 2001
From: Kaz Kylheku
Date: Wed, 19 Feb 2014 00:38:06 -0800
Subject: Fixing a long-running issue in the TXR pattern language: premature opening of files, prior to directives that actually need data.

The documentation basically lied that this is the case: namely, the
text "A file isn't opened until the query demands material from that
file, and then the contents are read on demand, not all at once."
This is now a fact.

* match.c (non_matching_directive_table): New global variable.
(open_data_source): New static function. Contains an almost verbatim
migration of the source-opening logic that used to be in match_files.
The useless assignment to c->nil is gone, and c->data == t is
explicitly tested for. Instead of assuming that only the @(next)
directive does not need to have a data source open, the table of
non-matching directives is consulted. Opening the data source is now
skipped for numerous directives.
(match_files): Call open_data_source within the loop. This means that
even after processing numerous non-matching directives, we will still
correctly set up the data lazy list.
(dir_tables_init): Initialize non_matching_directive_table, protect
from GC and populate with numerous directives.

* txr.1: Improved documentation for @(next :args), and removed a
description of the hack that a single @(next) at the top of the query
suppressed the opening of the data source.
---
 txr.1 | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

(limited to 'txr.1')

diff --git a/txr.1 b/txr.1
index f9c50347..90d45953 100644
--- a/txr.1
+++ b/txr.1
@@ -1461,13 +1461,13 @@ match failure.
 The variant @(next :args) means that the remaining command line arguments are
 to be treated as a data source. For this purpose, each argument is considered
 to be a line of text. If an argument is currently being processed as an input
-source, that argument is included. Note that if the first entry in the argument
-list is not intended to name an input source, then the query should begin with
-@(next :args) or some other form of next directive, to prevent an attempt to
-open the input source named by that argument. If the very first directive of a query is any variant of the next directive, then
-.B TXR
-avoids opening the first input source, but it does open the input source for
-any other directive, even one which does not consume any data.
+source, that argument is included at the front of the list. As the arguments
+are matched, they are consumed. This means that if a @(next) directive without
+arguments is executed in the scope of @(next :args), it opens the file named
+by the first unconsumed argument.
+
+To process arguments, and then continue with the original file and argument
+list, wrap the argument processing in a @(block).
 
 The variant @(next :env) means that the list of process environment variables
 is treated as a source of data. It looks like a text file stream
--
cgit v1.2.3