Reading VertNet .txt files

#r

Question:

I'm a beginner R user, so I apologize in advance. I'm trying to read a .txt data file from VertNet.org into my workspace. I tried loading the .txt file the way I have done successfully before, but that method does not identify all of the data columns present in the file.

 Sciurus_carolinensis_total <- read.delim("Sciurus_carolinensis_total.txt")

Below is the code for the columns that I need in my final data frame.

 Sciurus_carolinensis_total <- select(Sciurus_carolinensis_total, c(genus, specificepithet, sex, year, month, day, countrycode, stateprovince, county, decimallatitude, decimallongitude, lengthinmm, lengthtype))
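
When an import silently drops or mangles columns, a `select()` call like the one above fails on the first name it cannot find. A minimal base R sketch of checking the wanted names against what was actually imported follows; the `squirrels` data frame and its values are made up for illustration and are not taken from the VertNet file.

```r
# Toy data frame standing in for the imported file (hypothetical values).
squirrels <- data.frame(
  genus = "Sciurus",
  specificepithet = "carolinensis",
  sex = "female",
  year = 1968L,
  stringsAsFactors = FALSE
)

# A few of the column names wanted in the final data frame.
needed <- c("genus", "specificepithet", "sex", "year", "lengthinmm")

# Names that were requested but are absent after the import.
missing_cols <- setdiff(needed, names(squirrels))
print(missing_cols)

# Keep only the requested columns that actually exist.
present <- intersect(needed, names(squirrels))
subset_df <- squirrels[, present, drop = FALSE]
str(subset_df)
```

A non-empty `missing_cols` points at the import step rather than at the column selection.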

Below are the top few lines of the .txt file. I wasn't sure whether this would be an appropriate way to provide this information. Let me know if a different format would be better.

 type modified license rightsholder accessrights bibliographiccitation references... PhysicalObject CC0 California Academy of Sciences. CAS Mammalogy (MAM). Recor... PhysicalObject 2009-06-05 15:46:10.0 CC0 California Academy of Sciences. CAS M... PhysicalObject 2019-08-25 10:42:36.0 http://creativecommons.org/publicdomain/zer... PhysicalObject 2019-09-18 16:45:02.0 http://creativecommons.org/publicdomain/zer... PhysicalObject 2018-02-28 https://creativecommons.org/publicdomain/zero/1.0/ ht...  

The data above is truncated (each line actually contains more than 1400 characters) and it really does have embedded tabs, but the HTML here does not show them. Here is the dput output for that text, untruncated:

 c("typetmodifiedtlicensetrightsholdertaccessrightstbibliographiccitationtreferencestinstitutionidtcollectionidtdatasetidtinstitutioncodetcollectioncodetdatasetnametbasisofrecordtinformationwithheldtdatageneralizationstdynamicpropertiestoccurrenceidtcatalognumbertrecordnumbertrecordedbytindividualcounttsextlifestagetreproductiveconditiontbehaviortestablishmentmeanstoccurrencestatustpreparationstdispositiontassociatedmediatassociatedreferencestassociatedsequencestassociatedtaxatothercatalognumberstoccurrenceremarkstorganismidtorganismnametorganismscopetassociatedoccurrencestassociatedorganismstpreviousidentificationstorganismremarkstmaterialsampleidteventidtfieldnumberteventdateteventtimetstartdayofyeartenddayofyeartyeartmonthtdaytverbatimeventdatethabitattsamplingprotocoltsamplingefforttfieldnotesteventremarkstlocationidthighergeographyidthighergeographytcontinenttwaterbodytislandgrouptislandtcountrytcountrycodetstateprovincetcountytmunicipalitytlocalitytverbatimlocalitytminimumelevationinmeterstmaximumelevationinmeterstverbatimelevationtminimumdepthinmeterstmaximumdepthinmeterstverbatimdepthtminimumdistanceabovesurfaceinmeterstmaximumdistanceabovesurfaceinmeterstlocationaccordingtotlocationremarkstdecimallatitudetdecimallongitudetgeodeticdatumtcoordinateuncertaintyinmeterstcoordinateprecisiontverbatimcoordinatestverbatimlatitudetverbatimlongitudetverbatimcoordinatesystemtverbatimsrstfootprintwkttfootprintsrstgeoreferencedbytgeoreferenceddatetgeoreferenceprotocoltgeoreferencesourcestgeoreferenceverificationstatustgeoreferenceremarkstgeologicalcontextidtearliesteonorlowesteonothemtlatesteonorhighesteonothemtearliesteraorlowesterathemtlatesteraorhighesterathemtearliestperiodorlowestsystemtlatestperiodorhighestsystemtearliestepochorlowestseriestlatestepochorhighestseriestearliestageorloweststagetlatestageorhigheststagetlowestbiostratigraphiczonethighestbiostratigraphiczonetlithostratigraphictermstgrouptformationtmembertbedtidentificationidtidentificationqualifierttypes
tatustidentifiedbytdateidentifiedtidentificationreferencestidentificationverificationstatustidentificationremarkstscientificnameidtnamepublishedinidtscientificnametacceptednameusagetoriginalnameusagetnamepublishedintnamepublishedinyearthigherclassificationtkingdomtphylumtclasstordertfamilytgenustsubgenustspecificepithettinfraspecificepithetttaxonranktverbatimtaxonranktscientificnameauthorshiptvernacularnametnomenclaturalcodettaxonomicstatusttaxonremarkstlengthinmmtlengthtypetlengthunitsinferredtmassingtmassunitsinferredtunderivedlifestagetunderivedsextdataset_urltdataset_citationtgbifdatasetidtgbifpublisheridtdataset_contact_emailtdataset_contacttdataset_pubdatetlastindexedtmigrator_versionthasmediathastissuetwascaptivetisfossiltisarchtvntypethaslength", "PhysicalObjectttCC0tttCalifornia Academy of Sciences. CAS Mammalogy (MAM). Record ID: urn:catalog:CAS:MAM:24943. Source: http://ipt.calacademy.org:8080/ipt/resource.do?r=mam (source published on 2019-07-23)thttp://portal.vertnet.org/o/cas/mam?id=urn-catalog-cas-mam-24943ttttCAStMAMttPreservedSpecimentttturn:catalog:CAS:MAM:24943t24943tCBC3tC. B. ClarkttfemaletadulttttttSNttttttSFSUtMeasurements: 17 1/2-7.9-2 1/4-0.75 in; no wt. Skin only.ttttttttttt1968-10-31tt305t305t1968t10t31t31 Oct 1968ttttttttNorth America; USA; California; San Francisco Co.tNorth AmericattttUSAttCaliforniatSan Francisco Co.ttNear North Lake, Golden Gate Park, San Francisco.tttttttttttt37.7700200000t-122.5014300000tNAD27t241ttt37.7700228000t-122.5014321000tttttKristina Yamamotot2002-12-01tMaNIS georeferencing guidelinestTerrain Navigator 5.03 USGS 1:24,000tunverifiedtttttttttttttttttttttttDouglas J. Longt2000-05-26ttttttSciurus carolinensis pennsylvanicustttttAnimalia; Chordata; Mammalia; Rodentia; SciuridaetAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensistpennsylvanicustttttttt17ttotal lengtht1t0.75t1tadulttFttCalifornia Academy of Sciences. CAS Mammalogy (MAM). 
Source: http://ipt.calacademy.org:8080/ipt/resource.do?r=mam (source published on 2019-07-23)t6ce7290f-47f6-4046-8356-371f5b6749dft66522820-055c-11d8-b84e-b8a03c50a862tmflannery@calacademy.orgtMaureen Flanneryt2019-07-23t2019-08-04tno migratort0t0t0t0t0tspeciment1", "PhysicalObjectt2009-06-05 15:46:10.0tCC0tttCalifornia Academy of Sciences. CAS Mammalogy (MAM). Record ID: urn:catalog:CAS:MAM:28420. Source: http://ipt.calacademy.org:8080/ipt/resource.do?r=mam (source published on 2019-07-23)thttp://portal.vertnet.org/o/cas/mam?id=urn-catalog-cas-mam-28420ttttCAStMAMttPreservedSpecimentttWeight=482.9 g.; Length=455.0 mm.turn:catalog:CAS:MAM:28420t28420ttUnknownttfemaletAdulttexternal charactersttttTISSUE; SNtttttttWeight taken on 5 October 1995. Tail: 196.0 mm., HF: 60.0 mm., ear: 26.0 mm.ttttttttttt1993-08-24tt236t236t1993t8t24t24 August 1993ttttttttNorth America; USA; California; San Mateo Co.tNorth AmericattttUSAttCaliforniatSan Mateo Co.ttRedwood Citytttttttttttt37.4698300000t-122.2227100000tNAD27t9469ttt37.4698329000t-122.2227144000tttttJulian A. Kapoort2002-12-22tMaNIS georeferencing guidelinestTerrain Navigator 5.03 USGS 1:24,000tunverifiedttttttttttttttttttttttttttttttSciurus carolinensistttttAnimalia; Chordata; Mammalia; Rodentia; SciuridaetAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensisttttttttt455.0ttotal lengtht0t482.9t0tAdulttFttCalifornia Academy of Sciences. CAS Mammalogy (MAM). Source: http://ipt.calacademy.org:8080/ipt/resource.do?r=mam (source published on 2019-07-23)t6ce7290f-47f6-4046-8356-371f5b6749dft66522820-055c-11d8-b84e-b8a03c50a862tmflannery@calacademy.orgtMaureen Flanneryt2019-07-23t2019-08-04tno migratort0t1t0t0t0tspeciment1", "PhysicalObjectt2019-08-25 10:42:36.0thttp://creativecommons.org/publicdomain/zero/1.0/tthttp://vertnet.org/resources/norms.htmltChicago Academy of Sciences. CHAS Mammalogy Collection (Arctos). Record ID: http://arctos.database.museum/guid/CHAS:Mamm:2019.1.74?seid=4307578. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=chas_mammals (source published on 2019-07-07)thttp://arctos.database.museum/guid/CHAS:Mamm:2019.1.74tCHASt113ttCHAStMammal specimensttPreservedSpecimentmask part attribute locationttverbatim preservation date=2017-12-10 ; sex=male ; total length=504 mm; tail length=219 mm; weight=630 g; fat deposition=subcutaneous fat: heavythttp://arctos.database.museum/guid/CHAS:Mamm:2019.1.74?seid=4307578tCHAS:Mamm:2019.1.74ttCollector(s): Steve Sullivan; Preparator(s): Yuqing Wangt1tmalettttttskull; skin, studyttttttpreparator number=YW-02tthttp://arctos.database.museum/guid/CHAS:Mamm:2019.1.74tttttlt;igt;Sciurus carolinensislt;/igt; (accepted ID) identified by Yuqing Wang on 2017-12-10; method: student Remark: Eastern gray squirrel.ttttt2014-11-04tt308t308t2014t11t04tNov 4 2014ttttttttNorth America, United States, Illinois, Cook CountytNorth AmericattttUnited StatesttIllinoistCook CountyttPeggy Notebaert Nature Museum, 2430 North Cannon Drive, ChicagotPNNM, Chicago, Cook, ILtttttttttSteve Sullivantt41.926469t-87.634817tnot recordedt131tttttttttSteve Sullivant2014-11-04tGeoLocatetGeoLocatetunverifiedtttttttttttttttttttttAttYuqing Wangt2017-12-10ttstudenttEastern gray squirrel.tttSciurus carolinensistttttAnimalia; Chordata; Mammalia; Rodentia; Sciuridae; Sciurinae;tAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensisttspeciesttttICZNttt504ttotal lengtht0t630t0ttmalettChicago Academy of Sciences. CHAS Mammalogy Collection (Arctos). Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=chas_mammals (source published on 2019-07-07)t4837d6b0-19fe-4fe4-9ddf-b62dd17a060etf2489500-dbab-4fbc-95ed-19eead127483tdroberts@naturemuseum.orgtDawn Robertst2019-07-07t2019-09-21tno migratort0t0t0t0t0tspeciment1", "PhysicalObjectt2019-09-18 16:45:02.0thttp://creativecommons.org/publicdomain/zero/1.0/tthttp://vertnet.org/resources/norms.htmltChicago Academy of Sciences. CHAS Mammalogy Collection (Arctos). 
Record ID: http://arctos.database.museum/guid/CHAS:Mamm:3720?seid=4235951. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=chas_mammals (source published on 2019-07-07)thttp://arctos.database.museum/guid/CHAS:Mamm:3720tCHASt113ttCHAStMammal specimensttPreservedSpecimentmask part attribute locationttverbatim collector=E.V. Komarek ; sex=male ; unformatted measurements=TL: 435, T: 202, HF: 60, Et.: 363g ; total length=435 mm; tail length=202 mm; hind foot with claw=60 mm; weight=363 gthttp://arctos.database.museum/guid/CHAS:Mamm:3720?seid=4235951tCHAS:Mamm:3720ttCollector(s): Edwin V. Komarekt1tmalettttttskull; skin, studyttttttUUID=149e86ee-2118-410f-8248-a0d859201335; secondary identifier=Scc-24t"INTERNAL NOTES: collection date listed as January 7, 1936 in Mammal Catalog Book (taped spine), but as "January 1, 1936" in 2nd Mammal Catalog Book, needs to be verified [A.King]. DATA HISTORY: Inventory catalogued/verified by Collections staff (2008-2010 inventory). Record last updated in Excel (prior to Arctos migration) by Dawn R. Roberts (2013-11-30). Date listed as entered in original FileMaker database: 1988-07-06."thttp://arctos.database.museum/guid/CHAS:Mamm:3720tttttlt;igt;Sciurus carolinensislt;/igt; (accepted ID) identified by unknown; method: legacy Remark: Eastern Gray Squirrel.lt;brgt;lt;igt;Sciurus carolinensis carolinensislt;/igt; identified by unknown; method: legacyttttt1936-01-01tt1t1t1936t01t01t[transcribed directly into formatted date fields]ttttttttNorth America, United States, Georgia, Charlton CountytNorth AmericattttUnited StatesttGeorgiatCharlton CountyttChase Prairie, Okefenokee SwamptChase Prairie, Okefenokee Swamp, GeorgiatttttttttEdwin V. KomarektGeoreferenced by John Keating on 16 June 2015 [achinn 20 December 2018].t30.816062t-82.226234tWorld Geodetic System 1984t2435tt30.816062/-82.226234tttdecimal degreesttttEdwin V. 
Komarekt1936-01-01tGeoLocatetGeoLocatetunverifiedtttttttttttttttttttttAttunknowntttlegacytEastern Gray Squirrel.tttSciurus carolinensistttttAnimalia; Chordata; Mammalia; Rodentia; Sciuridae; Sciurinae;tAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensisttspeciesttttICZNttt435ttotal lengtht0t363t0ttmalettChicago Academy of Sciences. CHAS Mammalogy Collection (Arctos). Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=chas_mammals (source published on 2019-07-07)t4837d6b0-19fe-4fe4-9ddf-b62dd17a060etf2489500-dbab-4fbc-95ed-19eead127483tdroberts@naturemuseum.orgtDawn Robertst2019-07-07t2019-09-21tno migratort0t0t0t0t0tspeciment1", "PhysicalObjectt2018-02-28thttps://creativecommons.org/publicdomain/zero/1.0/tthttp://vertnet.org/resources/norms.htmltCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 67dd2632-537e-11e6-9649-a4a3446a4726. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)thttp://portal.vertnet.org/o/cumv/mamm?id=67dd2632-537e-11e6-9649-a4a3446a4726thttp://grbio.org/cool/i64g-wjcrthttp://grbio.org/cool/67hr-z96tttCUMVtMammttPreservedSpecimenttt{"hind foot length with claw in mm":"62","stomach contents":"Masticated nut meats","tail length in mm":"200","total length in mm":"480","weight":"515.3","ear length from notch":"31"," left gonad length in mm":"26"," left gonad width in mm":"15"," right gonad length in mm":"26"," right gonad width in mm":"15","weight unit":"g" }t67dd2632-537e-11e6-9649-a4a3446a4726t16608ttDon SchofflerttmalettTestes scrotaltttpresenttskeletontttttttttttttttttt1993-03-25tt84t84t1993t3t25t1993-03-25tttttcollecting method: shottttNorth America | United States | New York | | | | |tNorth AmericattttUnited StatestUStNew YorktttSchuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot ForesttNorth America | United States | New York | Schuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot 
Forestttttttttttt42.276659t-76.655259tWGS84t3225tttttttttDBCreatorttttrequires verificationttttttttttttttttttttttttttttttSciurus carolinensistttttAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | SciurustAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensisttspeciesttttICZNttt480ttotal lengtht0t515.3t1ttmalettCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)t35720b3e-aded-4b83-b4f1-967f1d457d6atcf9ceb80-9f3d-11da-b791-b8a03c50a862tcbd63@cornell.edutCasey Dillmant2018-07-02t2018-07-03t2018-01-08t0t0t0t0t0tspeciment1", "PhysicalObjectt2018-02-28thttps://creativecommons.org/publicdomain/zero/1.0/tthttp://vertnet.org/resources/norms.htmltCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 6806d370-537e-11e6-9649-a4a3446a4726. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)thttp://portal.vertnet.org/o/cumv/mamm?id=6806d370-537e-11e6-9649-a4a3446a4726thttp://grbio.org/cool/i64g-wjcrthttp://grbio.org/cool/67hr-z96tttCUMVtMammttPreservedSpecimenttt{"hind foot length with claw in mm":"59","tail length in mm":"163","total length in mm":"440","weight":"583.5","ear length from notch":"26","weight unit":"g" }t6806d370-537e-11e6-9649-a4a3446a4726t21304ttEdward S. 
ThomasttmaletadulttTestes scrotal 22 x 12 mmtttpresenttstudy skin - 1; skeleton - 1; tissue (frozen) - 1tttttttttttttttttt2010-04-06tt96t96t2010t4t6t2010-04-06ttttttttNorth America | United States | New York | Tompkins County | | | |tNorth AmericattttUnited StatestUStNew YorktTompkinsttDryden Township, Ellis HollowtNorth America | United States | New York | Tompkins County | Dryden Township, Ellis Hollowttttttttttt42.42766t-76.38849tWGS84t2078tttttttttDBCreatort2007-06-13tttrequires verificationttttttttttttttttttttttttttttttSciurus carolinensistttttAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | SciurustAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensisttspeciesttttICZNttt440ttotal lengtht0t583.5t1tadulttmalettCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)t35720b3e-aded-4b83-b4f1-967f1d457d6atcf9ceb80-9f3d-11da-b791-b8a03c50a862tcbd63@cornell.edutCasey Dillmant2018-07-02t2018-07-03t2018-01-08t0t1t0t0t0tspeciment1", "PhysicalObjectt2018-02-28thttps://creativecommons.org/publicdomain/zero/1.0/tthttp://vertnet.org/resources/norms.htmltCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 6810ae67-537e-11e6-9649-a4a3446a4726. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)thttp://portal.vertnet.org/o/cumv/mamm?id=6810ae67-537e-11e6-9649-a4a3446a4726thttp://grbio.org/cool/i64g-wjcrthttp://grbio.org/cool/67hr-z96tttCUMVtMammttPreservedSpecimenttt{"hind foot length with claw in mm":"64","tail length in mm":"230","total length in mm":"505","weight":"580","weight unit":"g" }t6810ae67-537e-11e6-9649-a4a3446a4726t1832ttWilliam J. Hamilton Jr.ttmaletadulttTestes enlarged amp; descendedtttpresenttstudy skin - 1; skull - 1tttttttRec'd from W.J. 
Hamilton, Jr.ttttttttttt1938-10-05tt278t278t1938t10t5t1938-10-05tttttcollecting method: killed by cartttNorth America | United States | New York | Tompkins County | | | |tNorth AmericattttUnited StatestUStNew YorktTompkinsttNewfieldtNorth America | United States | New York | Tompkins County | Newfieldttttttttttt42.362018t-76.590778tnot recorded (forced WGS84)t3036tttttttttttttrequires verificationttttttttttttttttttttttttttttttSciurus carolinensistttttAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | SciurustAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensisttspeciesttttICZNttt505ttotal lengtht0t580t1tadulttmalettCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)t35720b3e-aded-4b83-b4f1-967f1d457d6atcf9ceb80-9f3d-11da-b791-b8a03c50a862tcbd63@cornell.edutCasey Dillmant2018-07-02t2018-07-03t2018-01-08t0t0t0t0t0tspeciment1", "PhysicalObjectt2018-03-20thttps://creativecommons.org/publicdomain/zero/1.0/tthttp://vertnet.org/resources/norms.htmltCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 681424b0-537e-11e6-9649-a4a3446a4726. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)thttp://portal.vertnet.org/o/cumv/mamm?id=681424b0-537e-11e6-9649-a4a3446a4726thttp://grbio.org/cool/i64g-wjcrthttp://grbio.org/cool/67hr-z96tttCUMVtMammttPreservedSpecimenttt{"hind foot length with claw in mm":"71","tail length in mm":"208","total length in mm":"476","weight":"455","weight unit":"g" }t681424b0-537e-11e6-9649-a4a3446a4726t3842tWJHJ 2460tWilliam J. Hamilton Jr.ttmaletadulttTestes small, not descended.tttpresenttstudy skin - 1; skull - 1tttttttSkull broken but saved. Fleas preserved.; Received from W.J. 
Hamilton, Jr.ttttttttttt1947-01-28tt28t28t1947t1t28t1947-01-28ttttttttNorth America | United States | New York | Tompkins County | | | |tNorth AmericattttUnited StatestUStNew YorktTompkinsttIthaca Township, Cayuga Heights, Highland RoadtNorth America | United States | New York | Tompkins County | Ithaca Township, Cayuga Heights, Highland Roadttttttttttt42.465699t-76.490741tWGS84t1488tttttttttDBCreatort2009-08-26tttrequires verificationttttttttttttttttttttttttttttttSciurus carolinensis leucotistttttAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | SciurustAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensistleucotistsubspeciesttttICZNttt476ttotal lengtht0t455t1tadulttmalettCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)t35720b3e-aded-4b83-b4f1-967f1d457d6atcf9ceb80-9f3d-11da-b791-b8a03c50a862tcbd63@cornell.edutCasey Dillmant2018-07-02t2018-07-03t2018-01-08t0t0t0t0t0tspeciment1", "PhysicalObjectt2018-02-28thttps://creativecommons.org/publicdomain/zero/1.0/tthttp://vertnet.org/resources/norms.htmltCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 68340671-537e-11e6-9649-a4a3446a4726. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)thttp://portal.vertnet.org/o/cumv/mamm?id=68340671-537e-11e6-9649-a4a3446a4726thttp://grbio.org/cool/i64g-wjcrthttp://grbio.org/cool/67hr-z96tttCUMVtMammttPreservedSpecimenttt{"hind foot length with claw in mm":"63","tail length in mm":"210","total length in mm":"405","weight":"557","ear length from notch":"18"," left gonad length in mm":"25"," left gonad width in mm":"15"," right gonad length in mm":"32"," right gonad width in mm":"15","weight unit":"g" }t68340671-537e-11e6-9649-a4a3446a4726t17452ttDon Schofflerttmaletttttpresenttskeletontttttttttttttttttt1993-03-25tt84t84t1993t3t25t1993-03-25tttttcollecting method: shottttNorth America | United States | New York | | | | |tNorth AmericattttUnited StatestUStNew YorktttSchuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot ForesttNorth America | United States | New York | Schuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot Forestttttttttttt42.276659t-76.655259tWGS84t3225tttttttttDBCreatorttttrequires verificationttttttttttttttttttttttttttttttSciurus carolinensistttttAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | SciurustAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensisttspeciesttttICZNttt405ttotal lengtht0t557t1ttmalettCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)t35720b3e-aded-4b83-b4f1-967f1d457d6atcf9ceb80-9f3d-11da-b791-b8a03c50a862tcbd63@cornell.edutCasey Dillmant2018-07-02t2018-07-03t2018-01-08t0t0t0t0t0tspeciment1", "PhysicalObjectt2018-02-28thttps://creativecommons.org/publicdomain/zero/1.0/tthttp://vertnet.org/resources/norms.htmltCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 6834ab99-537e-11e6-9649-a4a3446a4726. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)thttp://portal.vertnet.org/o/cumv/mamm?id=6834ab99-537e-11e6-9649-a4a3446a4726thttp://grbio.org/cool/i64g-wjcrthttp://grbio.org/cool/67hr-z96tttCUMVtMammttPreservedSpecimenttt{"hind foot length with claw in mm":"62","tail length in mm":"220","total length in mm":"465","weight":"536","ear length from notch":"32","weight unit":"g" }t6834ab99-537e-11e6-9649-a4a3446a4726t18586ttDora E. Worbsttfemaletttttpresenttstudy skin - 1; skeleton - 1tttttttttttttttttt1995-11-02tt306t306t1995t11t2t1995-11-02tttttcollecting method: killed by cattttNorth America | United States | New York | Tompkins County | | | |tNorth AmericattttUnited StatestUStNew YorktTompkinsttCaroline Township, White Church amp; Ridgeway Roads, ~4.3 km S BrooktondaletNorth America | United States | New York | Tompkins County | Caroline Township, White Church amp; Ridgeway Roads, ~4.3 km S Brooktondalettttttttttt42.3443t-76.3857tWGS84t112tttttttttDBCreatort2009-01-26tttrequires verificationttttttttttttttttttttttttttttttSciurus carolinensistttttAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | SciurustAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensisttspeciesttttICZNttt465ttotal lengtht0t536t1ttfemalettCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)t35720b3e-aded-4b83-b4f1-967f1d457d6atcf9ceb80-9f3d-11da-b791-b8a03c50a862tcbd63@cornell.edutCasey Dillmant2018-07-02t2018-07-03t2018-01-08t0t0t0t0t0tspeciment1", "PhysicalObjectt2018-02-28thttps://creativecommons.org/publicdomain/zero/1.0/tthttp://vertnet.org/resources/norms.htmltCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 68384b01-537e-11e6-9649-a4a3446a4726. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)thttp://portal.vertnet.org/o/cumv/mamm?id=68384b01-537e-11e6-9649-a4a3446a4726thttp://grbio.org/cool/i64g-wjcrthttp://grbio.org/cool/67hr-z96tttCUMVtMammttPreservedSpecimenttt{"hind foot length with claw in mm":"70","stomach contents":"Empty","tail length in mm":"233","total length in mm":"475","weight":"512","ear length from notch":"31","weight unit":"g" }t68384b01-537e-11e6-9649-a4a3446a4726t17444ttunknownttfemalettShrunken, abdominaltttpresenttskeletontttttttttttttttttt1994-03-23tt82t82t1994t3t23t1994-03-23tttttcollecting method: shottttNorth America | United States | New York | | | | |tNorth AmericattttUnited StatestUStNew YorktttSchuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot ForesttNorth America | United States | New York | Schuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot Forestttttttttttt42.276659t-76.655259tWGS84t3225tttttttttDBCreatorttttrequires verificationttttttttttttttttttttttttttttttSciurus carolinensistttttAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | SciurustAnimaliatChordatatMammaliatRodentiatSciuridaetSciurusttcarolinensisttspeciesttttICZNttt475ttotal lengtht0t512t1ttfemalettCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)t35720b3e-aded-4b83-b4f1-967f1d457d6atcf9ceb80-9f3d-11da-b791-b8a03c50a862tcbd63@cornell.edutCasey Dillmant2018-07-02t2018-07-03t2018-01-08t0t0t0t0t0tspeciment1")  

Below is the code I used to load the .txt file and create a data frame of the 13 columns.

 # Set the working directory (no path was given in the post)
 setwd()
 install.packages("tidyverse")
 library(tidyverse)
 install.packages("lubridate")
 library(lubridate)
 install.packages("cowplot")
 library(cowplot)

 # Read the tab-separated file, then keep only the 13 columns of interest
 Sciurus_carolinensis_total <- readr::read_tsv("Sciurus_carolinensis_total.txt") %>%
   select(genus, specificepithet, sex, year, month, day, countrycode, stateprovince, county, decimallatitude, decimallongitude, lengthinmm, lengthtype)

Thank you for your help. This was a much simpler solution than I had thought.
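
The thread does not establish why `read.delim()` missed columns. One common cause with files like this, stated here only as an assumption, is quote characters inside free-text fields. A minimal base R sketch of reading a tab-separated file with quote handling switched off follows; the sample rows are invented for illustration.

```r
# Write a tiny tab-separated sample to a temporary file (made-up rows).
sample_lines <- c(
  "genus\tsex\toccurrenceremarks\tlengthinmm",
  "Sciurus\tfemale\tRec'd from \"W.J.\" Hamilton\t455.0",
  "Sciurus\tmale\t{\"weight\":\"515.3\"}\t480"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# quote = "" tells read.delim to treat quote characters as ordinary text.
squirrels <- read.delim(tmp, quote = "", stringsAsFactors = FALSE)
print(dim(squirrels))
print(squirrels$occurrenceremarks)

unlink(tmp)
```

With `quote = ""`, every tab starts a new field regardless of any quotes in the text, so the column count follows the header.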

Comments:

1. I suggested an edit that (imo) makes this question much less unwieldy: questions that are long because of detailed descriptions (or rambling) are one thing, but long, badly formatted data is another. Code is not the only thing that can (and should) go in code blocks; data works well in them too.

2. If this is the actual (raw) file format, then I don't think an off-the-shelf parser will work well. There is no clear delimiter (other than variable whitespace), so the first data row's missing modified timestamp is a problem, since we cannot authoritatively say that it is missing (without pattern matching, etc.). If VertNet has a more well-defined format (e.g., json, xml, csv, tsv), then it may be better to work with that instead. (It is also quite possible that the Stack site is masking tabs as spaces here, which might help. We may need your clarification on that.)

3. @r2evans Thank you for the suggested edit to my post. I will be sure to do that in the future.

4. Data downloaded from VertNet only comes as a text document, which I believe is tab-delimited. Does that help? There is no way to upload a text document to this site for analysis, is there?

5. It does, and now that I look more closely, I see that the Stack interface is distorting things a bit: when I edit your question to view the raw data, it appears to have tabs in it, but the HTML does not. I don't think it is realistic to expect users to have to edit your question to get the data, so I just suggested an edit covering most (12 of 15) of your raw lines while still staying within Stack's 30k character limit.
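
The question raised in these comments, whether the file really is tab-delimited and whether an empty field such as a missing modified timestamp survives, can be checked directly in R. The sketch below uses a made-up three-column sample rather than the real download; `count.fields()` is from base R's utils package.

```r
# Made-up sample: the second line has an empty 'modified' field.
sample_lines <- c(
  "type\tmodified\tlicense",
  "PhysicalObject\t\tCC0",
  "PhysicalObject\t2009-06-05 15:46:10.0\tCC0"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# Does the header line contain literal tab characters?
first_line <- readLines(tmp, n = 1)
has_tabs <- grepl("\t", first_line, fixed = TRUE)

# Number of tab-separated fields on each line, with quoting disabled.
fields_per_line <- count.fields(tmp, sep = "\t", quote = "", comment.char = "")
print(has_tabs)
print(table(fields_per_line))

unlink(tmp)
```

If every line reports the same field count, the empty values are being kept as empty fields and a tab-aware reader should line the columns up.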