Just Want to Confirm

Tarot Redhand

Am I right in thinking that the only characters that are legal to use in a 2da file are those whose ASCII value lies in the range 32 (0x20, the space character) to 126 (0x7E, '~')? This ignores the end-of-line characters 'Carriage Return' and 'Line Feed', and also TAB, which I always replace with 4 spaces anyway.
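To make the question concrete, here is a minimal Python sketch of the check being described. It assumes the range 0x20 to 0x7E really is the legal set, which is exactly what is being asked and is not confirmed here. The function name find_illegal_chars is made up for illustration. TAB is skipped rather than reported, and CR/LF are treated purely as line endings.

```python
import re

def find_illegal_chars(text):
    """Return (line_number, column, char) for every character outside 0x20-0x7E.

    Assumption: printable ASCII is the legal set. TAB is tolerated here;
    CR and LF only ever act as line endings.
    """
    problems = []
    # Split on CR/LF explicitly so other odd characters are not silently
    # swallowed as line breaks (str.splitlines would do that).
    for line_no, line in enumerate(re.split(r"\r\n|\r|\n", text), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\t":
                continue
            if not 0x20 <= ord(ch) <= 0x7E:
                problems.append((line_no, col, ch))
    return problems
```

Each entry gives a 1-based line and column, so the result can go straight into an error log.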

Thanks in advance.

TR

Vanya Mia

If you put them between inverted commas, it doesn't seem to care what you put in a name, bar the three you specifically mentioned. I've definitely seen /, & and *, which would normally be rejected. The filename does need to follow traditional filename conventions, though: no spaces, parentheses, control characters or anything that's used in mathematical formulae and logic calculations. Basically it seems only numbers, letters and the _ are allowed, so that's likely to apply to all columns that don't have specific formatting too.

"I took Skill Focus: Craft Disturbing Mental Image as my feat last level." Belkar, OOTS

Tarot Redhand

That was what I was thinking too. The reason I am asking is that I am considering writing a 2da error-detecting/correcting tidy-up program after having seen that 2da file. Some errors that are easy enough for a human to detect (once a line is pointed out as wonky) and/or fix pose real problems computationally. One example is working out where the closing quote is supposed to go once it has been detected as missing (the detection itself is easy: count the quotation marks, and an odd count means one is missing). Two related examples are lines with missing columns and lines with too many columns. Given how much the length of good lines can vary, detecting these two can also be problematic. The best idea I have had so far is to keep a rolling average of the line lengths and compare it with the next line in the sequence. If the length is out by, say, 10%, then flag it up for the user to check visually. By a rolling average I mean one that is continuously calculated from all the preceding lines.
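A rough Python sketch of those two checks, the odd quote count and the running average of line lengths, might look like the following. The function name flag_suspect_lines and the tolerance parameter are made up for illustration, and it assumes the caller passes in data rows only (no version line or header row). Flagged lines still feed into the average, matching "all the preceding lines".

```python
def flag_suspect_lines(lines, tolerance=0.10):
    """Return (line_number, reason) pairs for lines worth a visual check.

    Assumption: `lines` holds data rows only. `tolerance` is the allowed
    relative deviation from the mean length of all preceding lines.
    """
    flagged = []
    total_length = 0
    seen = 0
    for line_no, line in enumerate(lines, start=1):
        stripped = line.rstrip("\r\n")
        # An odd number of quotation marks means one of a pair is missing.
        if stripped.count('"') % 2 == 1:
            flagged.append((line_no, "odd number of quotation marks"))
        length = len(stripped)
        if seen > 0:
            mean = total_length / seen
            if mean > 0 and abs(length - mean) > tolerance * mean:
                flagged.append((line_no, "length differs from running mean"))
        # Update the running mean with every line, flagged or not.
        total_length += length
        seen += 1
    return flagged
```

Since good rows can legitimately differ a lot in length, this will probably produce false positives on real files, so it is better treated as a list of lines to eyeball than as a verdict.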

TBH, if anyone has any better ideas I'll definitely consider them; otherwise it will produce an error log file with a list of line numbers to check.

TR

HipMaestro

This is an interesting project you've embarked upon, Tarot. Kudos for all the research thus far.

A question:  Do 2das always use the same delimiter?  I am assuming a TAB but have no way to confirm this (never opened one as hex or binary myself). 

Typical spreadsheets can be delimited using whatever character (or metacharacter) seems plausible, but the choice will restrict what format individual strings can contain, and the engine's "digestion" even more so.

Madness... a blessing for the chosen few.

4760

If it's the same as NWN2, the delimiter is tab or spaces (whatever their number).

Wouldn't it be more reliable to count the number of tabs, or runs of spaces, and deduce the number of columns that way? The issue, however, is that a space found after an opening " (i.e., after an odd number of ") should not be counted, and as soon as the number of " becomes even again, all the spaces that follow are part of the delimiter (which means that several spaces together count as only one).
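That rule can be sketched in Python roughly as below. This is only an illustration and assumes the delimiter behaves as described above (any run of tabs or spaces outside quotes counts as one separator). The function name split_2da_row is made up. It also reports whether a quote was left open at the end of the line, which is the odd-count case.

```python
def split_2da_row(line):
    """Split one row into fields; return (fields, quote_left_open).

    Assumption: any run of spaces/tabs outside quotes is a single
    delimiter, and whitespace inside a quoted value belongs to the value.
    """
    fields = []
    current = []
    in_quotes = False
    has_field = False  # True once a field has started, even an empty ""
    for ch in line.rstrip("\r\n"):
        if ch == '"':
            in_quotes = not in_quotes  # odd count so far means "inside"
            has_field = True
        elif ch in " \t" and not in_quotes:
            # Close the current field; further whitespace is ignored.
            if has_field:
                fields.append("".join(current))
                current = []
                has_field = False
        else:
            current.append(ch)
            has_field = True
    if has_field:
        fields.append("".join(current))
    return fields, in_quotes
```

The number of columns in a row is then just len(fields). When the second value is True, the last field has swallowed everything after the unmatched quote, so the count for that row cannot be trusted.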

meaglyn

I wrote something like this a while back. It's a Perl tool, so it may not be as easy to run on Windows, but it should work with a Cygwin setup. It will validate the proper number of columns and such for each row based on the headers. The one issue it handles poorly is a quoted name with spaces that is missing one of its quotes. But if you get unexpected results, that's a clue. It can also normalize the spacing, fix the line numbering and remove extra columns or fill in missing ones. Another useful feature is that it can diff 2 files and report what's changed. You can get it here.

FWIW, this works by actually reading the file into a 2-dimensional array. That makes it trivial to detect whether each row has the right number of elements. It does rely on the header row being correct. But because it can be told to fix missing elements (****), it makes it easy to add a new column.
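For anyone who wants to see the idea without Perl, here is a small Python sketch of the same general approach. It is not the tool mentioned above, just an illustration. The names normalize_rows and FILLER are made up, and it assumes each data row carries a leading row-number field that the header row does not label (hence the + 1). Rows are expected to be already split into fields.

```python
FILLER = "****"  # the 2da placeholder for an empty cell

def normalize_rows(header, rows):
    """Pad short rows with FILLER and trim long ones to match the header.

    Assumption: data rows have one more field than the header, because
    the leading row number has no column label. Returns (fixed, report),
    where report lists (row_index, message) for every row that changed.
    """
    width = len(header) + 1
    fixed = []
    report = []
    for index, row in enumerate(rows):
        row = list(row)  # copy so the caller's data is left untouched
        if len(row) < width:
            missing = width - len(row)
            report.append((index, f"padded {missing} missing column(s)"))
            row.extend([FILLER] * missing)
        elif len(row) > width:
            extra = len(row) - width
            report.append((index, f"dropped {extra} extra column(s)"))
            row = row[:width]
        fixed.append(row)
    return fixed, report
```

Adding a new column then amounts to appending its name to the header and running the rows through again, since every row comes up one short and gets padded. Note that trimming throws data away, so a real tool would probably want to log the dropped values rather than discard them quietly.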

There is a spec for the 2da format in the Bioware developer docs, which I think Proleric (or was that you, TR? I don't remember) re-posted recently.

Cheers,

Meaglyn
