I am writing the output of my code to a .csv file. There are three directories, and each directory contains 50 files. I want to write the results for each directory's files in a separate column, like this:

            group1  group2  group3
    file1   1445    89      87
    file2   1225    100     47
    file3   650     120     67
    file4   230     140     97

I have the following code to do so:

    from collections import Counter
    import glob
    import os

    out = open('output.csv', 'a')
    out.write(';''group-1')
    out.write(';''group-2')
    out.write(';''group-3')
    out.write('\n')
    i = 1
    while i <= 50:
        out.write("file-%d" % i)
        out.write('\n')
        i += 1
    i = 1
    path = 'group/group-*-files/*.txt'
    files = sorted(glob.glob(path))
    c = Counter()
    for filename in files:
        for line in open(filename, 'r'):
            c.update(line.split())
        for item in c.items():
            oi = ("{}\t{}".format(*item))
            out_array = oi.split()
            if out_array[0] == '00000000':
                out.write(out_array[1])
                out.write('\n')
        c.clear()

The problem, which I have not been able to solve, is that the results start being written in the first column, after file number 50:

    file48
    file49
    file50
    1445
    1225
    ..

I want to write the first 50 results under the group1 column, the next 50 under group2, and the last 50 under group3.
The final output should look like this:

            group1  group2  group3
    file1   145     89      87
    file2   850     100     47
    file3   650     120     67
    file4   230     140     97

asked Jun 29, 2017 at 13:49
This is how I would rewrite your code. The changes I made are:
- Use the with statement when opening files to make sure they get closed
- Use the csv module to make writing the csv file easier
- Write the whole line at once by building one line at a time before writing it to the file.
Since I don't really know what is in your files, this isn't thoroughly tested.

    import csv
    from collections import Counter
    import glob
    import os

    with open('output.csv', 'a') as out:
        writer = csv.writer(out, delimiter='\t')
        # Header row: an empty cell above the file names, then one cell per group
        writer.writerow([''] + ['group{}'.format(i) for i in range(1, 4)])
        path = 'group/group-*-files/*.txt'
        files = sorted(glob.glob(path))
        c = Counter()
        for i, filename in enumerate(files, start=1):
            # Build the whole output row first; it gets its own name so the
            # loop over the input lines below does not overwrite it
            row = ['file-{}'.format(i)]
            with open(filename) as infile:
                for text in infile:
                    c.update(text.split())
            for key, count in c.items():
                if key == '00000000':
                    row.append(count)
            writer.writerow(row)
            c.clear()

answered Jun 29, 2017 at 14:21
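To get the three-column layout the question asks for, one option is to collect the counts per directory first and only then turn them into rows. The following is a minimal sketch under some assumptions: it targets Python 3, the helper names `count_token` and `write_group_table` are made up for illustration, and the `;` delimiter is taken from the header writes in the question.

```python
import csv
import glob
import os
from collections import Counter


def count_token(path, token='00000000'):
    """Return how many whitespace-separated words in the file equal token."""
    counts = Counter()
    with open(path) as infile:
        for text in infile:
            counts.update(text.split())
    return counts[token]  # a Counter returns 0 for a missing key


def write_group_table(group_dirs, out_path, token='00000000'):
    """Write one row per file position and one column per group directory."""
    # One list of counts per directory, i.e. one list per output column.
    columns = [
        [count_token(p, token)
         for p in sorted(glob.glob(os.path.join(d, '*.txt')))]
        for d in group_dirs
    ]
    with open(out_path, 'w', newline='') as out:
        writer = csv.writer(out, delimiter=';')
        header = ['group-{}'.format(n) for n in range(1, len(columns) + 1)]
        writer.writerow(['file'] + header)
        # zip() turns the per-group columns into per-file rows; it stops at
        # the shortest column if the directories hold different file counts.
        for i, row in enumerate(zip(*columns), start=1):
            writer.writerow(['file-{}'.format(i)] + list(row))
```

With the directory layout from the question, a call might look like `write_group_table(sorted(glob.glob('group/group-*-files')), 'output.csv')`. Note that `sorted()` orders names as strings, so a name such as `file10.txt` sorts before `file2.txt` unless the numbers are zero-padded.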
You have at least one problem, which is wrong indentation. First you generate all the file names with this:

    ...
    while i<=50:
        out.write( "file-%d" %i )
        out.write('\n')  # replace \n with the column delimiter \t
        i+=1

Only after that do you begin to process the files. Delete the line i=1, and indent all the code that follows to the same level as out.write, so that it runs inside the while loop:

    from collections import Counter
    import glob
    import os

    out = open('output.csv', 'a')  # flag a - do you want to append to an existing file?
    out.write('file;group-1;group-2;group-3\n')  # You forgot column 1 - the file name
    # out.write (';''group-1')
    # out.write (';''group-2')
    # out.write (';''group-3')
    # out.write('\n')
    i = 1
    while i <= 50:
        out.write("file-%d" % i)
        # out.write('\n')
        out.write(';')  # Insert the column delimiter character
        i += 1
        # i=1  Delete, because it would cause an infinite loop
        # The following code must run inside the while loop, so indent it
        # to the same level as the previous lines
        # Note: as written, files holds every matched file from all three
        # groups, so each pass of the while loop writes a value for all of them
        path = 'group/group-*-files/*.txt'
        files = sorted(glob.glob(path))
        c = Counter()
        for filename in files:
            for line in open(filename, 'r'):
                c.update(line.split())
            for item in c.items():
                oi = ("{}\t{}".format(*item))
                out_array = oi.split()
                if out_array[0] == '00000000':
                    out.write(out_array[1])
                    # out.write('\n') - you don't want new lines here, only a new column for every group
                    out.write(';')
            c.clear()
        out.write('\n')  # New line - new record
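Another way to look at the problem: the sorted file list yields the counts as one flat sequence, first all of group 1, then group 2, then group 3. Slicing that sequence into equal chunks and transposing the chunks gives one row per file. Below is a small sketch of that idea; the function names `columns_to_rows` and `render_table` are hypothetical, and the sample numbers are the ones from the question's first example table.

```python
import csv
import io


def columns_to_rows(values, per_group):
    """Split a flat list into consecutive chunks and transpose them into rows."""
    columns = [values[start:start + per_group]
               for start in range(0, len(values), per_group)]
    # zip() stops at the shortest chunk, so len(values) should be a
    # multiple of per_group to keep every row.
    return [list(row) for row in zip(*columns)]


def render_table(values, per_group, delimiter=';'):
    """Return the table as CSV text with a header row and file labels."""
    rows = columns_to_rows(values, per_group)
    n_groups = len(rows[0]) if rows else 0
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator='\n')
    header = ['group-{}'.format(n) for n in range(1, n_groups + 1)]
    writer.writerow(['file'] + header)
    for i, row in enumerate(rows, start=1):
        writer.writerow(['file-{}'.format(i)] + row)
    return buffer.getvalue()


# Counts in the order the question's loop produces them: group 1, 2, then 3
counts = [1445, 1225, 650, 230, 89, 100, 120, 140, 87, 47, 67, 97]
print(render_table(counts, per_group=4))
```

For the real data, `per_group` would be 50 and `counts` would hold the 150 values gathered in the file loop.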
marc_s
answered Jun 29, 2017 at 14:29

    for filename in files:
        for item in c.items():
            oi=("{}\t{}".format(*item))
            out_array = oi.split()

    for filename in files:
        for line in open(filename,'r'):
            c.update(line.split())
        for item in c.items():
            oi=("{}\t{}".format(*item))
            out_array = oi.split()

toku-sa-n
answered Jan 28 at 11:00