I am struggling to do this in R. I have data from a set of ~50 CSV files, each detailing one week's book sales transactions:
**Week 1**

**Author**   **Title**   **Customer ID**
Author 1     Title 1     1
Author 1     Title 2     2
Author 2     Title 3     3
Author 3     Title 4     3

**Week 2**

**Author**   **Title**   **Customer ID**
Author 1     Title 1     4
Author 3     Title 4     5
Author 4     Title 5     1
Author 5     Title 6     6

... and so on for ~50 weeks, each from a different CSV file.

I want to end up with a new table in which each row represents an author who appears anywhere in the full data set, with ~50 columns, one per week. Each cell should hold the number of sales of that author's books in that week. That number is simply the count of rows for that author in that week's sales file, so the result should look like this:
**Author**   **Week 1**   **Week 2**   ...   **Week 50**
Author 1     2            1            ...
Author 2     1            0            ...
Author 3     1            1            ...
Author 4     0            1            ...
Author 5     0            1            ...
...

Any ideas? I know how to get a list of unique authors to create the first column, and I know how to load each week's sales data into a data frame, but I need help automating this process:

1) Looping over the unique authors
2) Looping over each week's CSV file or data frame
3) Counting the sales for that author in that week
4) Adding the count as the value of that cell
Can anyone help? Thank you :-)
    # Week 1: the first line holds the week label, the rest is the table
    text1 <- "**Week 1**
    **Author** **Title** **CustomerID**
    Author1 Title1 1
    Author1 Title2 2
    Author2 Title3 3
    Author3 Title4 3"
    df1 <- read.table(header = TRUE, skip = 1, stringsAsFactors = FALSE, text = text1)
    # Read the label line on its own and strip the surrounding asterisks
    week1 <- read.table(header = FALSE, nrows = 1, stringsAsFactors = FALSE, text = text1, sep = ";")[1, 1]
    week1 <- substr(week1, 3, nchar(week1) - 2)
    df1$week <- rep(week1, nrow(df1))

    # Week 2: same steps
    text2 <- "**Week 2**
    **Author** **Title** **CustomerID**
    Author1 Title1 4
    Author3 Title4 5
    Author4 Title5 1
    Author5 Title6 6"
    df2 <- read.table(header = TRUE, skip = 1, stringsAsFactors = FALSE, text = text2)
    week2 <- read.table(header = FALSE, nrows = 1, stringsAsFactors = FALSE, text = text2, sep = ";")[1, 1]
    week2 <- substr(week2, 3, nchar(week2) - 2)
    df2$week <- rep(week2, nrow(df2))

    # Stack the weeks into one long data frame
    df <- rbind(df1, df2)
    names(df) <- c("author", "title", "customer id", "week")

    # Count rows per author and week, then spread the weeks into columns
    require(plyr)
    agg <- ddply(df, ~author*week, function(df) length(df$title))

    require(reshape)
    res <- cast(agg, author ~ week, value = "V1", fill = 0)
    res

       author Week 1 Week 2
    1 Author1      2      1
    2 Author2      1      0
    3 Author3      1      1
    4 Author4      0      1
    5 Author5      0      1

You only need one loop, to read in your data. For that you can use something like:

    ff <- list.files(pattern = "\\.[Cc][Ss][Vv]$")
    for (i in 1:length(ff)) {
      # read the data from ff[i] and append it to the combined data frame
    }
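As a rough illustration of that file-reading loop, here is a minimal base-R sketch. It rests on several assumptions that are not in the question: that each weekly CSV has an `author` column, that the week label can be taken from the file name, and that the files sit in one directory. The file names `week01.csv` and `week02.csv`, the temporary directory, and the column names are made up so the example can run on its own. It also swaps plyr/reshape for base R's `table()`, which does the counting and the reshaping in one step.

```r
# Hypothetical setup: write two small weekly files to a temporary directory
# so the sketch is self-contained. Real data would already be on disk.
dir <- file.path(tempdir(), "weekly_sales")
dir.create(dir, showWarnings = FALSE)
write.csv(data.frame(author = c("Author1", "Author1", "Author2", "Author3"),
                     title = c("Title1", "Title2", "Title3", "Title4"),
                     customer_id = c(1, 2, 3, 3)),
          file.path(dir, "week01.csv"), row.names = FALSE)
write.csv(data.frame(author = c("Author1", "Author3", "Author4", "Author5"),
                     title = c("Title1", "Title4", "Title5", "Title6"),
                     customer_id = c(4, 5, 1, 6)),
          file.path(dir, "week02.csv"), row.names = FALSE)

# Read every CSV and tag its rows with a week label
# (assumed here to be the file name without its extension).
ff <- list.files(dir, pattern = "\\.[Cc][Ss][Vv]$", full.names = TRUE)
sales <- do.call(rbind, lapply(ff, function(f) {
  d <- read.csv(f, stringsAsFactors = FALSE)
  d$week <- tools::file_path_sans_ext(basename(f))
  d
}))

# Cross-tabulate: one row per author, one column per week, counts in the cells.
# Authors with no sales in a given week get 0.
counts <- table(sales$author, sales$week)
res <- as.data.frame.matrix(counts)
res
```

With `table()` there is no need to loop over authors at all: the only loop is the one over files, and the author names end up as the row names of `res`.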