Working with Excel / CSV Files
automate-excel-reporting-with-python 自動化處理
import csv file_read = open('myfile.csv', 'rb') reader = csv.reader(file_read, delimiter=',') for row in reader: print row[0], row[1]
# Data Science Cookbook p.222 import unicodecsv as csv
差別在於 UnicodeReader, UnicodeWriter
計算 CSV 的行數,小檔案可用 row_count = len(list(csv.reader(open('myfile.csv')))) 但大檔案要用 row_count = sum(1 for row in csv.reader(open('myfile.csv'))),另外也要注意欄位值可能包含 \n 造成的問題。
預設不會判斷資料型別,統一使用 String 格式,可寫程式猜測型別。
Exporting
collective.excelexport exporting folder listing
例如想要匯出 myfolder 目錄的項目清單,在 ZMI 的 myfolder 目錄裡,新增 Python Script,以名稱 listing 為例,內容如下:
request = container.REQUEST response = request.response for brain in context.getFolderContents(): print brain.Title + ', ' + brain.Creator response.setHeader("Content-type","application/vnd.ms-excel") response.setHeader("Content-disposition","attachment;filename=FolderData.xls") return printed
在網址裡使用 myfolder/listing 就會出現另存檔案的視窗。
Mongo Query to CSV Download (ExpressJS)
Example Content Type
with MySQL
pyExcelerator by Tareq Alam blog
data = map(lambda x: {'date': x[0].value, 'campaign': x[1].value, 'users': x[2].value, 'sessions': x[3].value}, ws[ROW, ws.max_row])
Exporting ZSQL Query
Importing
openpyxl vs xlrd 讀取大檔 (例如10Mb 費時約 1分鐘) 需要記憶體,不然會當機。
from openpyxl import load_workbook wb = load_workbook('my_excel.xlsx', read_only=True) print wb.get_sheet_names() for i in wb.get_sheet_names(): print i ws = wb.get_sheet_by_name('Sheet1') # for old versions
# print ws.get_highest_row(), ws.get_highest_column()
print ws.max_row, ws.max_col cell = ws['A1'] cell.value cell.row cell.column for i in range(1,9): print i, sheet1.cell(row=i,column=1).value
excel template xlwings: Excel automation / interaction
import pandas as pd data = pd.read_excel('my-excel.xlsx', sheetname=0)
Filter Rows Containing a String Pattern
del df['column_name'] vs del.df.column_name
Pandas 讀取舊版 Excel 時,底層使用 xlrd 但仍可能遇到 No CODEPAGE record, no encoding_override: will use 'ascii' 錯誤。