【Posted】: 2015-11-08 11:36:20
【Question】:
I am trying to read movies.txt with the F# CsvProvider type provider, using the code below:
type movies = CsvProvider<"../../movies.csv","/",InferRows=0,HasHeaders=false,IgnoreErrors=true,AssumeMissingValues=true,MissingValues="">
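F# infers the movies type as: FSharp.Data.Runtime.CsvFile<System.Tuple<string,string,string>>
As a result, only the first three column values are read. I know this is not a uniform CSV file, i.e. the number of columns differs from row to row. I would like to know whether this file is suitable for CsvProvider. Is there another type provider that can parse the file above?
Sample records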
Akira (1988)/Louie, Detroit/Lindsay, Michael (II)/Martin, Dan (II)/Stone, Doug (I)/Blum, Steven Jay/Woren, Dan/Forest, Michael (I)/Wurst, Brad/Akimoto, Yôsuke/Cole, George C./Katô, Masayuki (I)/Prescott, Simon/Reynolds, Mike (I)/Held, Watney/Prince, Derek Stephen/Lembaw, Mike/Ôtake, Hiroshi/Lang, Lex/Kusao, Takeshi/Arakawa, Tarô/Bosch, Johnny Yong (I)/Strong, Sam (I)/Buckley, Ivan/Taggert, Jim/Hirano, Masato/Seth, Joshua/Sholder, Adam/Inagaki, Satoru/Sasaki, Nozomu/Buchholz, Bob/Joyce, Christopher (I)/Sorich, Michael/Hustin, Matthew/Lemay, Lewis/Thornton, Kirk/Nakamura, Tatsuhiko/Staley, Steve (II)/Grant, Dougary/McConnohie, Michael/Pinkham, Guy/Kishino, Yukimasa/Ishida, Tarô/Umezu, Hideyuki/Osborne, Jonathan C./Iwata, Mitsuo/Tanaka, Kazumi/Stellrecht, Skip/Kamifuji, Kazuhiro/Spellos, Peter/Pope, Tony/Lee, Peter (I)/Winant, Bruce/Price, Jamieson/Ikemizu, Michihiro/Clarke, Cam/Oliver, Tony (I)/Rae, Ted/Futamata, Issei/Axelrod, Robert/Murray, Ethan/Gurd Jr., Stanley/Ôkura, Masaaki/Romersa, Joe/Walters, Burt/Kramer, Steve (I)/Kitamura, Kôichi/Mercer, Matthew/Bassett, William/Suzuki, Mizuho/Kelso, Lee/Nitta, Sanshirô/Knight, William (III)/Genda, Tesshô/Wimberger, Kurt P./Plantagenet, Richard/Shioya, Kôzô/Hatch, W.T./MacKenzie, Cody/Bergen, Bob/Frierson, Eddie/Itô, Fukue/Phelan, Julie (III)/Brown, Emily (I)/Lane, Marilyn/Ferhardt, Josil/Darro, Bambi/Fujii, Kayoko/Thornton, Chloe/Ôno, Yuka/Goodson, Barbara/Gee, Jessica/Taylor, Julie Anne/Ruff, Michelle/Koyama, Mami/Tissier, Barbara/Cody, Lara/Fuchizaki, Yuriko/Lee, Wendee/Toyoshima, Masami/Ja Lee, Patricia/Forstadt, Rebecca/Tarulli, Lisa/Fox, Sandy (I)/Marshall, Mona (I)/Sarducci, Tony
Aladdin (1992)/Burton, Corey/Cummings, Jim (I)/Young, Philip/Williams, Robin (I)/Welker, Frank/Adler, Charles (I)/Gottfried, Gilbert/Kane, Brad (I)/Proctor, Phil/Gooch, Bruce/Seale, Douglas/Weinger, Scott/Angel, Jack (I)/Wahl, Chris/Houser, Jerry/Freeman, Jonathan (I)/Clarke, Philip L./Pinney, Patrick/Adler, Bruce/McGowan, Mickie/Taylor, Russi/Derryberry, Debi/Lockwood, Vera/Larkin, Linda/Lynn, Sherry (I)/Darling, Jennifer/Zielinski, Kathy/Salonga, Lea
Each record is essentially a forward-slash (/) delimited string containing the movie title followed by the actors' names.
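Given that record layout, a single line can be taken apart without any type provider. The following is only a minimal sketch: `parseRecord` is a hypothetical helper name, and the input is a shortened version of the Aladdin record above, used purely for illustration.

```fsharp
// Hypothetical helper: split one '/'-separated record into (title, actors).
let parseRecord (line: string) : string * string list =
    match line.Split('/') |> List.ofArray with
    | title :: actors -> title, actors
    | [] -> "", []   // not expected: Split returns at least one element

// Shortened sample record (illustration only, not the full line from the file).
let title, actors =
    parseRecord "Aladdin (1992)/Burton, Corey/Cummings, Jim (I)/Salonga, Lea"

printfn "%s has %d actors" title (List.length actors)
```

Because the actor count varies per line, a list is a more natural fit here than a fixed-width tuple of columns.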
【Comments】:
-
Could you share a sample from the input file?
-
Thanks. I have added two records from the file.
-
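Just curious: how do you plan to use this file once it is parsed? Its structure is title/actor#/actor#/.... You have no header row (you could add one, of course), so you would have to use meaningless column numbers? The file could be a CSV once the missing delimiters are added. But wouldn't it be easier to treat it as a plain text file and simply read it line by line, split each line on the delimiter, then slice the result, treating the first element as the title and the remaining list as the actors?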
-
I am trying to build a symbol graph, as described here: algs4.cs.princeton.edu/41undirected
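The line-by-line approach suggested in the comments, applied to the symbol-graph goal, might look roughly like the sketch below. It is an assumption-laden illustration rather than anyone's posted solution: `buildGraph` and `addEdge` are hypothetical names, the graph is modelled as a plain `Map<string, Set<string>>` in which every movie is linked to each of its performers, and the file path in the commented-out line is a guess.

```fsharp
// Hypothetical helper: turn '/'-separated records into an undirected
// adjacency map where each movie is linked to each of its performers.
let buildGraph (lines: seq<string>) : Map<string, Set<string>> =
    // Add the directed edge v -> w; called twice to make the edge undirected.
    let addEdge (v: string) (w: string) (g: Map<string, Set<string>>) =
        let neighbours = defaultArg (Map.tryFind v g) Set.empty
        Map.add v (Set.add w neighbours) g
    lines
    |> Seq.filter (fun (l: string) -> l.Trim() <> "")   // skip blank lines
    |> Seq.fold (fun (g: Map<string, Set<string>>) (line: string) ->
        match line.Split('/') |> List.ofArray with
        | movie :: actors ->
            actors
            |> List.fold (fun acc a -> acc |> addEdge movie a |> addEdge a movie) g
        | [] -> g) Map.empty

// Assumed path; adjust to wherever movies.txt actually lives:
// let graph = System.IO.File.ReadLines "movies.txt" |> buildGraph

// Shortened sample record (illustration only).
let graph = buildGraph [ "Aladdin (1992)/Williams, Robin (I)/Salonga, Lea" ]

printfn "%A" graph.["Salonga, Lea"]
```

Looking up a performer yields the set of movies they appear in, and looking up a movie yields its cast, which is the adjacency information a symbol graph needs. Note that a movie with no listed actors would not get a vertex in this sketch.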
Tags: .net f# type-providers