forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Reading and writing Tabs

 
DanRRight



Joined: 10 Mar 2008
Posts: 2818
Location: South Pole, Antarctica

Posted: Sat May 29, 2010 9:56 pm    Post subject: Reading and writing Tabs

When you read character text that contains tabs and then write it back, the tabs are converted into spaces. How can the tabs be preserved in the output? Here is an example: if the file a.ini contains tabs, they disappear in b.out.
Code:
   CHARACTER*128 text

   open(111,file='a.ini')
   open(112,file='b.out')

   do i=1,1000
     read (111,'(a)',end=1000) TEXT
     write(112,'(a)') TRIM(TEXT)
   enddo
1000 close(111)
     close(112)
   end       


Last edited by DanRRight on Sat May 29, 2010 10:03 pm; edited 1 time in total
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Sun May 30, 2010 1:06 am

Dan,

The problem is with reading tabs. You can tell the library not to convert them to spaces when reading. See the documentation for OPEN:
Quote:
Reading tab characters
In a Fortran READ statement, by default, tab characters read from a file are converted to spaces. To avoid this conversion you should make a call to the subroutine READ_TABS@(unitno) immediately after the OPEN statement (unitno is the unit number of the stream).

I think that when writing tabs, no conversion is applied.
John
DanRRight



Joined: 10 Mar 2008
Posts: 2818
Location: South Pole, Antarctica

Posted: Sun May 30, 2010 6:13 am

Holy @#$... I did not know that. This compiler library can do literally anything.

Thanks, John. I lost a whole day stuck on this problem with no good solution in mind, and I was already thinking of asking comp.lang.fortran and losing more time, because it looks like there is no general portable solution in Fortran if a compiler-specific routine with @ had to be invented for this purpose.
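
On the portability worry: one approach that avoids compiler-specific routines is Fortran 2003 stream access, which reads the file one byte at a time, so no tab translation should take place. The following is only a minimal sketch, assuming a compiler that supports access='stream'; the program name and the file name tabs_demo.txt are made up for illustration.

```fortran
program stream_tab_demo
  implicit none
  character(len=1) :: ch
  character(len=*), parameter :: fname = 'tabs_demo.txt'  ! hypothetical file name
  integer :: ios, ntabs

  ! Write a small test file that contains two tab characters
  open (111, file=fname, access='stream', form='unformatted', status='replace')
  write (111) 'a' // char(9) // 'b' // char(9) // 'c' // char(10)
  close (111)

  ! Read it back one byte at a time and count the tabs
  ntabs = 0
  open (111, file=fname, access='stream', form='unformatted', status='old')
  do
    read (111, iostat=ios) ch
    if (ios /= 0) exit                 ! end of file (or a read error)
    if (ch == char(9)) ntabs = ntabs + 1
  end do
  close (111, status='delete')

  print *, 'tabs found:', ntabs
  if (ntabs /= 2) stop 'tab characters were not preserved'
end program stream_tab_demo
```

The price of this approach is that line handling becomes your own job: the program has to look for char(10) (and possibly char(13)) itself instead of relying on record-based READ.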
DanRRight



Joined: 10 Mar 2008
Posts: 2818
Location: South Pole, Antarctica

Posted: Tue Jun 01, 2010 12:49 am

Well... I have a new problem with the tabs.

When I try to find the position of a tab in the text string using INDEX, I get the wrong result.

To reproduce this, use the same demo code as above, modified slightly to include read_tabs@. The code just reads a line of text from one file and writes it to another, so that a.ini and b.out end up identical.

What goes wrong is the text manipulation on strings that contain tabs.
If the file a.ini consists of arbitrary text and one or several tabs, INDEX, which should give us the position of the first tab in iPosOfTab, produces something else that depends on the entire text of the line. Is this a "gray area" of the Fortran standard or just a bug?

On the other hand, if we remove read_tabs@ (and lose the ability to write exactly the same line back to b.out), INDEX works fine and finds the position of the tab correctly. The miracle is that the tab has been destroyed by being replaced with spaces, yet apparently it is not really replaced until the text is written to the file. Very hidden mechanics...

Code:
   CHARACTER*128 text

   open(111,file='a.ini')
   call READ_TABS@(111)
   open(112,file='b.out')

     read (111,'(a)',end=1000) TEXT
     write(112,'(a)') TRIM(TEXT)

1000 close(111)
     close(112)

     iPosOfTab = index(text,'   ')   ! the literal between the quotes was typed with the Tab key
     print*, iPosOfTab

   end 
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Tue Jun 01, 2010 6:02 am

Dan,

I would question your "wrong" assumption.

Try adding the following code to check what is in the line:
Code:
character c, tab
integer i,ic
do i = 1,len_trim(text)
  c = text(i:i)
  ic = ichar (c)
  write (*,*) i, ic   ! position and character code; a tab shows up as 9
end do

Make sure the parity bit (the 8th bit) is not set, as can happen with some files; in that case the tab would arrive as char(9+128).
Then check the value of the substring you are searching for. You should be able to find a tab in the text by using:
Code:
     tab = char (9) ! note tab is a character variable
     ipos = index (text,tab)

This should work !!

John
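
The two snippets in the post above can be combined into one self-contained check. This is only a sketch: the line is built in memory instead of being read from a.ini, and the program name and sample text are made up for illustration.

```fortran
program find_tab
  implicit none
  character(len=128) :: text
  character(len=1)   :: tab
  integer :: i, ipos

  tab  = char(9)                   ! tab defined by its ASCII code
  text = 'key' // tab // 'value'   ! illustrative line, stands in for a line read from a.ini

  ! Dump the position and character code of every character in the line
  do i = 1, len_trim(text)
    print *, i, ichar(text(i:i))
  end do

  ! The tab is the 4th character of the sample line
  ipos = index(text, tab)
  print *, 'first tab at position', ipos
  if (ipos /= 4) stop 'unexpected tab position'
end program find_tab
```

If the dump of a real line shows 32 where a 9 was expected, the tab was already converted to a space before INDEX ever saw it.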
DanRRight



Joined: 10 Mar 2008
Posts: 2818
Location: South Pole, Antarctica

Posted: Tue Jun 01, 2010 8:02 pm

OK, in summary, it looks like this damn problem was in the way the tab was defined. It seems we cannot use a tab typed on the keyboard, as in my example above:
Code:
     iPosOfTab = index(text,'   ') ! here is Tab from keyboard

This does not work, for a reason that is not yet clear to me. What does work is to define the tab via char, as you suggested:

Code:
     iPosOfTab = index(text,char(9))


THANKS John
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Wed Jun 02, 2010 1:09 am

I think that your code
Code:
     iPosOfTab = index(text,'   ') ! here is Tab from keyboard
would suffer from whatever interpretation your IDE/editor placed on the Tab key when it was pressed; many editors insert spaces instead of a tab character. You could assign this text string to a variable, say character*10, and print out the character values.
I do sometimes write out numerical results in a tab-delimited format for Excel, again using char(9). It is unfortunate that tabs give such unpredictable results (which is probably why .csv files are more common and usable than .tsv files). How do Europeans cope with trying to use 123,2 instead of 123.2?
John
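
Writing tab-delimited output with char(9), as mentioned in the post above, might look roughly like this. It is a sketch under assumptions: the file name results.tsv, the program name and the sample values are all made up, and each line is built with an internal write so the tab can be checked before it goes to the file.

```fortran
program write_tsv
  implicit none
  character(len=1), parameter :: tab = char(9)
  character(len=64) :: line
  real    :: x(3)
  integer :: i

  x = (/ 1.5, 2.25, 123.2 /)       ! sample values, illustration only

  open (112, file='results.tsv')   ! hypothetical output file
  write (112, '(a)') 'index' // tab // 'value'
  do i = 1, 3
    ! Build the line in memory first: integer, tab, real
    write (line, '(i0,a,f0.3)') i, tab, x(i)
    if (index(line, tab) == 0) stop 'tab missing from output line'
    write (112, '(a)') trim(line)
  end do
  close (112)
end program write_tsv
```

Because the tab comes from char(9) and never passes through an editor, it cannot be turned into spaces on the way into the source file.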
DanRRight



Joined: 10 Mar 2008
Posts: 2818
Location: South Pole, Antarctica

Posted: Wed Jun 02, 2010 4:01 am

The tab in my example should not depend on how the editor represents it in the text. A tab is a tab: CHAR(9). One symbol. Period.

It should be, but I am not sure it always is. What did you mean about the parity bit, and how can it be set differently?
IanLambley



Joined: 17 Dec 2006
Posts: 490
Location: Sunderland

Posted: Wed Jun 02, 2010 12:24 pm

John,
When I worked in Norway, the .csv files used a comma as the decimal point and a semicolon as the separator. For the thousands separator, they used the full stop. You would need to write a routine to translate these characters appropriately, taking quoted strings into account.

For example:

Norway/Europe
3.005,32;"hello, Goodbye";250,33

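Needs to be translated to the following (the thousands separator is dropped, because an unquoted comma would be read as a field delimiter):
3005.32,"hello, Goodbye",250.33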

and then handled like a British/American csv file.


Code:

character*100 line_in
line_in='3.005,32;"hello, Goodbye";250,33'
call swap_euro_csv(line_in)
print *, trim(line_in)
end
subroutine swap_euro_csv(line_in)
! convert a Euro-style csv line to Brit style, in place
character*(*) line_in
character*200 line_out
character*3 swaps(2)
logical in_quote
integer i, iout, iswap
! swaps(1) = characters to look for, swaps(2) = their replacements
data swaps/',;.' , '.,z'/
in_quote=.false.
iout = 0
do i=1,len_trim(line_in)
  if(line_in(i:i) .eq. '"')then
    in_quote = .not.in_quote
  endif
! only translate characters outside quoted strings
  if(.not.in_quote)then
    iswap = index(swaps(1),line_in(i:i))
  else
    iswap = 0
  endif
  if(iswap .le. 2)then
!
! iswap = 3 is a Euro dot (thousands separator): it is dropped, because
! swapping it for a comma would confuse the delimiter in Brit mode
    iout = iout + 1
    if(iswap .eq. 0)then
      line_out(iout:iout) = line_in(i:i)
    else
      line_out(iout:iout) = swaps(2)(iswap:iswap)
    endif
  endif
enddo
line_in = line_out(:iout)
end



Hilsen og farvel
Ian
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Wed Jun 02, 2010 2:36 pm

Dan,
Quote:
What did you mention about parity bit and how to set it differently?

Some other operating systems set the 8th bit (once called the parity bit) of each character, so that the numeric values fell in the range 128-255. I think you can still get this with files from some other operating systems, but it is probably not the problem here.
On Windows and DOS, the codes in that range are special characters.
John
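
If a file did arrive with the 8th bit set, one way to normalise it would be to mask each character code back into the 0-127 range before searching. A minimal sketch, assuming ICHAR returns values in 0-255 for such characters (which may vary by compiler); the program name and sample string are made up.

```fortran
program strip_high_bit
  implicit none
  character(len=16) :: text
  integer :: i, ic

  ! Illustrative input: 'a', a tab with the 8th bit set (9+128 = 137), then 'b'
  text = 'a' // char(9+128) // 'b'

  do i = 1, len_trim(text)
    ic = ichar(text(i:i))
    if (ic > 127) text(i:i) = char(iand(ic, 127))   ! clear the 8th bit
  end do

  ! After masking, the tab is an ordinary char(9) in position 2
  print *, 'tab position after masking:', index(text, char(9))
  if (index(text, char(9)) /= 2) stop 'masking failed'
end program strip_high_bit
```

This is only safe for data known to be 7-bit ASCII; on text that genuinely uses codes above 127 it would corrupt those characters.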
IanLambley



Joined: 17 Dec 2006
Posts: 490
Location: Sunderland

Posted: Wed Jun 02, 2010 5:26 pm

Parity bits were really for the transmission of ASCII data in the olden days, when I was a lad. Terminals in those days only used characters from 0 to 127, the latter being the delete character (or a back arrow when printed). Since the advent of the PC, all 256 character codes (0 to 255) are defined, which leaves no room for using the parity bit as an error-detection method for data transmission. Modern systems (by modern I mean for the last 25+ years) use packet switching with cyclic redundancy checks rather than parity checks on individual characters. It should not be a problem.

You need to use the char(n) method to define any character in the ASCII character set below 32 (20 hex = space); these are termed "non-printing" or control characters. A few useful ones, from memory, shown as the decimal/hex character number, the key press originally used, and the name:

Code:

3/03h = ctrl+C = cancel (interrupts processing in DOS & Digital Equipment operating systems)
7/07h = ctrl+G = bell
8/08h = ctrl+H = backspace
9/09h = ctrl+I = tab
10/0Ah = ctrl+J = line feed
12/0Ch = ctrl+L = form feed - new page on printer.
13/0Dh = ctrl+M = Carriage Return or just Return or even Enter
17/11h = ctrl+Q = XON (transmission on , restarts computer sending to terminal/printer, used for flow control)
19/13h = ctrl+S = XOFF (transmission off , stops computer sending to terminal/printer, used for flow control)


"Even parity" meant that the parity bit was set to one whenever that was needed to make the total number of bits set to one in the character an even number. Similarly there is the less usual "odd parity", which was specially designed to drive people nuts when logging on to that type of system: the computer thought every character was faulty, and the terminal thought the same of every character in the reply. I hate odd parity.

I hope this helps.

Ian
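
The even-parity rule described in the post above can be worked through for single characters. This is only an illustrative sketch (the program and function names are made up): it counts the bits set in a 7-bit code and sets bit 8 when the count is odd.

```fortran
program even_parity_demo
  implicit none
  integer :: tab_sent, c_sent

  tab_sent = with_even_parity(9)    ! tab: binary 0001001, two bits set
  c_sent   = with_even_parity(67)   ! 'C': binary 1000011, three bits set

  print *, 'tab sent as', tab_sent, ', C sent as', c_sent
  if (tab_sent /= 9 .or. c_sent /= 195) stop 'parity calculation is wrong'

contains

  ! Return the 8-bit value transmitted for a 7-bit code under even parity
  integer function with_even_parity(code)
    integer, intent(in) :: code
    integer :: k, nbits
    nbits = 0
    do k = 0, 6                      ! bits 0..6 hold the 7-bit ASCII code
      if (btest(code, k)) nbits = nbits + 1
    end do
    ! An odd count needs the parity bit (value 128) to make the total even
    with_even_parity = code + 128*mod(nbits, 2)
  end function with_even_parity

end program even_parity_demo
```

Under odd parity the rule is reversed, so a tab would be sent as 9+128 = 137, which matches the char(9+128) case mentioned earlier in the thread.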
All times are GMT + 1 Hour