|
|||||||
Those of you who've been in the industry for a while will remember the smugness with which we treated the Y2K issues the rest of the world were experiencing. It didn't affect us, we stored our dates in internal numerical format. I'm sure we all have favourite anecdotes. Mine was being called into the Press Office of a major Government department to certificate their AREV tracking system. I sat down and gave the system a cursory glance. "Ermmm - it doesn't seem to use dates?". "No, it doesn't". "Well, here's your certificate"... turns out they couldn't get the budget for system tweaks but they could for Y2K compliance. I spent the rest of the day implementing their desired changes. Anyway, this week wiped the smile off of our collective faces (if you'd ignored the advice in KB 42 thirty or so years ago - which to be honest you could be forgiven for) with the advent of our very own variant - we'll call it the 20100 bug, although it's not a bug, it's an unfortunate feature. Users began reporting that date searches were failing for values after the 10th of January 2023. At first we couldn't see an obvious reason. We built a database containing date values going back to the previous century with five rows per date and indexed it. We wrote a test program and ran it RUN ZZ_POP 10/01/2023 RUN ZZ_POP 11/01/2023 RUN ZZ_POP 20099 What we've fallen foul of is discussed in the KB reference earlier. Basically BTREE.EXTRACT takes what it is given and tries to ICONV it to use for the look up. If it can't ICONV it (20099 isn't a viable date) it uses the value passed. If it CAN ICONV it, is uses the ICONVed value. And guess what? 20100 ICONVs to the 2nd of January 2000 as you can see above. Of course, this won't be the first time this has happened in the wild - for example looking for the 7th of December 1998 using 11299 would have returned the 1st of December 1999 and so on. Note that the same issue will be experienced when using internal date formats with RLIST statements. The solution? When calling BTREE.EXTRACT use EXTERNAL values as it will try and ICONV the data before using it. AddendumWe've been asked to share the code we used to locate btree.extracts in client systems to enable a manual inspection to determine possible failure points. This is a rough and dirty hack which met our requirements. Feel free to tailor to your own requirements. 0001 Subroutine zz_find_Btree_Extract( void )
0002 /* 0003 Author AMcA 0004 Date Jan 2022 0005 Purpose Quick hack to help identify system usage of btree.extract 0006 Provided as is with no warranty 0007 */ 0008 0009 Gosub initialise 0010 Gosub process 0011 0012 Return 0013 0014 initialise: 0015 0016 filesToCheck = "SYSPROCS,SYSREPOSEVENTS,SYSTABLES" 0017 columnsToCheck = ",,8" 0018 0019 loopCtr = Dcount(filesToCheck, ",") 0020 resultString = "" 0021 0022 Return 0023 0024 process: 0025 0026 For loopPtr = 1 To loopCtr 0027 file = Field( filesToCheck, ",", loopPtr ) 0028 column = Field( columnsToCheck, ",", loopPtr ) 0029 0030 resultString := file : \0D0A\ 0031 0032 if file = "SYSTABLES" Then 0033 starting = "%" 0034 Gosub processSysTables 0035 End Else 0036 starting = "@" 0037 Gosub processRest 0038 End 0039 0040 resultString := \0D0A0D0A\ 0041 0042 Next 0043 0044 Oswrite resultString On "ZZ_BE.TXT" 0045 Call Set_Property("CLIPBOARD", "TEXT", resultString ) 0046 0047 Return 0048 0049 processRest: 0050 Open file To v Then 0051 Gosub processV 0052 End Else 0053 Call FSMsg() 0054 End 0055 Return 0056 0057 processV: 0058 0059 Select v 0060 eof = 0 0061 Loop 0062 Readnext id Else eof = 1 0063 Until eof Do 0064 If id[1, 1] = starting else 0065 Read row From v, id Then 0066 ptr = 1 0067 If column Then 0068 saveRow = row 0069 row = row< column > 0070 Convert @Vm To @Fm In row 0071 end 0072 lenRow = Len(row) 0073 lineNo = 0 0074 If lenRow > 0 then 0075 Loop 0076 nextline = row[ptr, @Fm] 0077 ptr = Col2() + 1 0078 lineno += 1 0079 there = IndexC( nextLine, "btree.extract(", 1) 0080 If there Then 0081 variable = Trim( nextLine[ there + 14, ","] ) 0082 If variable[1, 1] = "'" Or variable[1, 1] = '"' Then 0083 // passing literals - check the line 0084 resultString := file : \09\ : id : \09\ : lineNo : \09\ : nextLine : \0D0A\ 0085 End else 0086 // ok we're now going to work back through the code until we find = for our var 0087 tempLineNo = lineNo 0088 Loop 0089 tempLineNo -= 1 0090 line = row< tempLineNo > 0091 If Trimf( line )[1, 1] = "*" Then 0092 line = "" 0093 End 0094 Convert " " To "" In line 0095 line = " " : line 0096 Until IndexC( line, " " : variable : "=", 1) 0097 Until tempLineNo = 0 0098 Repeat 0099 If tempLineNo = 0 then 0100 resultString := file : \09\ : id : \09\ : lineNo : \09\ : "Not found " : row< LineNo > : \0D0A\ 0101 End else 0102 resultString := file: \09\ : id : \09\ : lineNo : \09\ : row< tempLineNo > : \0D0A\ 0103 end 0104 end 0105 End 0106 While ptr < lenRow 0107 0108 Repeat 0109 End 0110 End Else 0111 end 0112 Call send_info( file : " " : id ) 0113 end 0114 repeat 0115 return 0116 0117 processSystables: 0118 0119 Open file To v Then 0120 Select v 0121 saveV = v 0122 eof = 0 0123 dictCtr = 0 0124 0125 Loop 0126 Readnext id Else eof = 1 0127 Until eof Do 0128 If ID[1, 5] = "DICT." Then 0129 Open id To v Then 0130 0131 file = id 0132 0133 Call push.Select( v1, v2, v3, v4 ) 0134 Gosub processV 0135 Call pop.Select( v1, v2, v3, v4 ) 0136 eof = 0 0137 0138 End Else 0139 * Call FSMsg() 0140 End 0141 v = saveV 0142 end 0143 Repeat 0144 End Else 0145 Call FSMsg() 0146 End 0147 0148 return |
|||||||
| |||||||
13 Comments:
We ran into this problem yesterday. Could Revelation write a patch to BTREE.EXTRACT to account for 01/11/2023 issues?
By Anonymous, At 12 January 2023 at 15:06
Very helpful blog posting. Please note as well that the same issue applies to r/list statements - if you are used to using internal dates for search/selection criteria, you will have the same problem with incorrect results being returned. You MUST use externally formatted dates for date field comparisons in r/list as well - if you want to use internal date values instead, you should create a dictionary field that does not have a D conversion code and use that for your selection/search criteria...
By Anonymous, At 12 January 2023 at 15:27
It's a tough call. A fix was made to AREV when the code base split to OI and AREV (as alluded to in the KB article) but you'd be altering 30 years of behaviour and who knows what that might break down the line? Rock and a hard place I'd say. I just wrote a quick program to scan our client's code, identify all btree.extract calls, parse out the variable and scan backwards to its assignation. Put all of that info into a spread sheet and quickly identified that I only had a few programs to manually check.
By Sprezz, At 12 January 2023 at 15:29
I'm of the very strong opinion that you should never use ICONV values in selects.
There's no reason for it.
The conversion code is only executed once so any perceived speed hit is inconsequential.
Creating shadow dictionaries on indexed fields is just asking for trouble, or duplicating the index. There's just no reason for it, other simply being able to pass internal values inside your code.
I've posted about this many times in the past, and always ask this same question. Why should the system jump through hoops to allow you to pass an internal date, when no one would ever think of passing an internal value for an MD2 field or other conversion. It's only dates people want to pass internal. What's so special about dates?
It's just a bad idea all around, but I've love to hear the reasons for doing it this way.
By Aaron K (Sprezz), At 12 January 2023 at 16:23
Revelation will net be writing a patch for this. We do not have a solution that works 100% of the time.
Thanks-
Mike Ruane, Revelation Software
By Mike Ruane, At 12 January 2023 at 20:05
Is this the same bug that grounded all aircraft and affected the NOTAM system this week perhaps?
By Anonymous, At 12 January 2023 at 21:29
I started out doing it as oconv. But then I had the thought that I was wasting my time oconving it all the time, when I could just send in the raw, internal date. So, I tried that and it worked, so since then I have kept on not oconving it. So my instinct was to oconv it, but then I thought it was a waste of time, and because iconv worked, I went with it.
By josh, At 12 January 2023 at 23:11
"I'm of the very strong opinion that you should never use ICONV values in selects.
There's no reason for it."
It's an arbitrary decision. But what's important is that what is allowed is well understood and that the system enforces it.
By Anonymous, At 16 January 2023 at 13:02
"What's so special about dates?"
Because keywords and dates are 99% of what people search for in rlist/btree extract. I've never searched on a monetary value except 0.
By Anonymous, At 16 January 2023 at 13:08
Probably what should have happened here is that rev should have added a patch that retained the current behaviour but created a log whenever an rlist failed to convert a date. Then people could have found out about it ages ago and fixed it ages ago. But where would rlist log the data to. OI doesn't really have a standardised place where it puts logs. It probably should though.
By Anonymous, At 16 January 2023 at 22:59
I'm going to assume these were all written by the same author.
"Because keywords and dates are 99% of what people search for in rlist/btree extract. I've never searched on a monetary value except 0."
We must work in very different industries.
"It's an arbitrary decision. But what's important is that what is allowed is well understood and that the system enforces it. "
"Probably what should have happened here is that rev should have added a patch that retained the current behaviour but created a log whenever an rlist failed to convert a date. Then people could have found out about it ages ago and fixed it ages ago. But where would rlist log the data to. OI doesn't really have a standardised place where it puts logs. It probably should though."
I can't promise that it's well understood, but Revelation did release a technical bulletin about this in 1995, so around 27 years. This issue has been my bugbear for quite some time. There was a message thread on Rev's forum about this around a year ago, and I gave a pretty detailed explanation. Trust me, I've not been at all shy on talking about this. Not shy at all.
Basically, the system works like this:
The selection (or reduction, as it's called inside the system) criteria are parsed and broken down field by field. When the reduction values and dictionary fields are linked up, the system reads the dictionary and checks if there's an OCONV value. If there is, the system performs an ICONV using the OCONV setting and uses that value. If the ICONV value is null or failed, then the system uses the passed in value.
The system always performs the check. If I remember correctly, the system stores off the passed value, does the iconv on the original, and then resets if it's a failure. So, if anything, the system is actually slower because of the additional assignment.
I think people look at this from the wrong way around, as if it's all designed for code. It's not. It's really designed for RLIST. The code is just there to make the sentence processing easier. That's why it's RTP18.ENGLISH. It's meant to happen in English. It's all meant to be
SELECT SOMEFILE WITH DATE_FIELD = "12-25-2020"
and not
SELECT SOMEFILE WITH DATE_FIELD = 20093
or whatever the date is, because I'm not bothering to look it up because it doesn't mean anything to anything other than a simple way to track dates numerically to make it easier to do date math.
The point is, they have this down at a low level of the system, so they don't have to do anything to the data until it gets all the way down to the reduction engine. At that point, it does the conversion. Otherwise, there would need to be separate code for the three basic selection syntax engines in Rev; RLIST (or ENGLISH, the internal name), the REDUCE routine and BTREE.EXTRACT. Even then, all index selects go into the various BTREE routines, which are all called through BTREE.EXTRACT, where the conversion occurs. (It's actually a bit more complicated than that, but this is a comment, not a tech article, and it's not really my place to give out the recipe for the secret sauce). It should be obvious that it's an unintended feature of the system that it ever works at all. I thought it was anyway, and that's before I looked through the code when I was at Rev.
I don't know. You can code your systems as you want as you feel best. All I can tell you is how the system works and you can decide how efficient or inefficient you want your system to be.
By APK, At 17 January 2023 at 15:34
"I think people look at this from the wrong way around, as if it's all designed for code. It's not. It's really designed for RLIST. The code is just there to make the sentence processing easier. That's why it's RTP18.ENGLISH. It's meant to happen in English."
I disagree. If you allow humans to enter text into an rlist, and you don't check that text in your code, then you become vulnerable to rlist injection (like sql injection). So, you should never allow a human to enter anything into an rlsit without checking it in your code. T he simplest way to check the date is safe is to convert it to internal format to see if the conversion fails.
By Anonymous, At 17 January 2023 at 22:41
I don't think anyone is advocating for passing un-verified criteria to Rlist from an untrusted source. The point here is that Rlist is a human-readable sentence parser, so it stands to reason that the data it's intended to work on is also human-readable. That fact that someone made it support internal formats as well as the human-readable ones is the problem, because now we have added a dose of ambiguity into the mix.
Allowing humans to enter "raw" Rlist statements is perfectly acceptable in the correct context - allowing users access to TCL for example is a valid feature of many systems, but those are from trusted sources such as a user's workstation and not a web-browser.
By Captain C, At 18 January 2023 at 10:09
Post a Comment
Subscribe to Post Comments [Atom]
<< Home