Taking out bits of CSVs
I am new to programming and have a bunch of CSV files of about 50 to 60
rows each. After an unspecified number of rows, the string 'NAME' appears
in the second column. I want to take everything in the second column
after 'NAME' and write it to a text file. I initially thought of using
pandas with skiprows, but the problem is that 'NAME' is in a different
row in each CSV I run through. Also, if it helps, there is a blank line
three rows before 'NAME' in every CSV.
NUMBER,ITEM
N1,Shoe
N2,Heel
N3,Tee
N4,Polo
N5,Sneaker
N10,Heel
N11,Tee
...
...
...
How
Count 17
SORT,NAME
H1,Thing
H2,WTANK
H3,TEE2
H4,TEE
I would also like the text file to contain no repeats, because I will be
running through around 1000 CSVs in a directory. Here is the bit of code
I started out with, which is where I got stuck.
import pandas as pd
import csv
import glob

fns = glob.glob('*.csv')  # goes through every CSV file in the directory
for csv in fns:
    prod_df = pd.read_csv(csv, skiprows=???)
    with open(os.path.join('out', fn), 'wb') as f:
        w = csv.writer(f)
        test_alias = prod_df['NAME'].unique()
        w.writerow(row)
I know it doesn't work, and it is probably not a very good bit of code.
Any help would be greatly appreciated. Thank you!
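
One possible way to approach this, as a rough sketch rather than a definitive answer: since the row holding 'NAME' moves around, it may be simpler to skip pandas and skiprows and scan each file with the standard csv module, collecting second-column values only once 'NAME' has been seen. The sketch assumes the files are comma-delimited, that 'NAME' appears as the exact content of a second-column cell, and that short or blank rows can be ignored. The function names (names_after_marker, collect_unique, write_names) and the output file name used below are made up for illustration.

```python
import csv
import glob


def names_after_marker(path, marker="NAME", column=1):
    """Return the values in `column` found below the row whose `column` cell equals `marker`."""
    values = []
    found = False
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) <= column:
                continue  # blank line or a row with too few columns (e.g. "Count 17")
            cell = row[column].strip()
            if found:
                if cell:
                    values.append(cell)
            elif cell == marker:
                found = True  # start collecting from the next row onward
    return values


def collect_unique(pattern="*.csv"):
    """Gather values from every file matching `pattern`, dropping repeats but keeping first-seen order."""
    seen = set()
    ordered = []
    for path in sorted(glob.glob(pattern)):
        for value in names_after_marker(path):
            if value not in seen:
                seen.add(value)
                ordered.append(value)
    return ordered


def write_names(names, out_path):
    """Write one value per line to a plain text file."""
    with open(out_path, "w") as out:
        for name in names:
            out.write(name + "\n")
```

Run from the directory holding the CSVs, write_names(collect_unique('*.csv'), 'names.txt') would produce a single text file with each value listed once. The set is what removes repeats across all the files; the separate list is only there to keep the order in which values were first encountered.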