How do I remove rows that contain the following extensions using a regular expression pattern?

#windows #powershell #csv

Question:

I want to delete rows based on the "Script or expected file(s)" column. This column contains either the word "technical", an empty value, or a file name with an extension. A row should be deleted only if the column is empty, or ends with one of the following extensions:
.txt_go
.zip
.prd
.xml
.go
.csv
.txt
.xlsx
or contains _00* ??????????????

This is the CSV file:

Jobstream,"Jobstream Description","Op num","Job","Script or expected file(s)","Server","user","location","Job Description"
ACTOdataoutPACTO500f_ref12.prd,"WAIT TRGFIC-ACTO001","9","","ACTOdataoutPACTO500f_ref12.prd","","","","START"
ACTOTRGAAA,"WAIT TRGFIC-ACTO001","10","PACTOAAA","technical","","","","ADDJOBSTREAM"
prodFACIMPORT*.xml,"WAIT TRGFIC-ACTO001","9","","prodFACIMPORT*.xml","","","","START"
ACTOTRGAAB,"WAIT TRGFIC-ACTO001","10","PACTOAAB","technical","","","","ADDJOBSTREAM"
prodTREATIESDECLARATIONIMPORT*.xml,"WAIT TRGFIC-ACTO002","9","","prodTREATIESDECLARATIONIMPORT*.xml","","","","START"
ACTOTRGAAC,"WAIT TRGFIC-ACTO002","10","PACTOAAD","technical","","","","ADDJOBSTREAM"
prodACTOdatainPACTO003*_desc.xml,"WAIT TRGFIC-ACTO560","9","","prodACTOdatainPACTO003*_desc.xml","","","","START"
ACTOTRGAAD,"WAIT TRGFIC-ACTO560","10","PACTOAAE","technical","","","","ADDJOBSTREAM"
RODACTOdataarchivesf_ref12.prd.xml_????????_??????,"WAIT TRGFIC-ACTO999","9","","RODACTOdataarchivesf_ref12.prd.xml_????????_??????","","","","START"
REINSURANCE_DATAclient1-Sources*.xlsx,"WAIT TRGFIC-SHIN001","9","","REINSURANCE_DATAclient1-Sources*.xlsx","","","","START"
SHINTRGAAA,"WAIT TRGFIC-SHIN001","10","PSHINAAB","technical","","","","ADDJOBSTREAM"
prodSHINdatainPSHIN004*.zip,"WAIT TRGFIC-SHIN003","9","","prodSHINdatainPSHIN004*.zip","","","","START"
prodAGPCWEBX.001flgtrt.go,"WAIT TRGFIC-WEBX001","9","","prodAGPCWEBX.001flgtrt.go","","","","START"
WEBXTRGAAA,"WAIT TRGFIC-WEBX001","10","PWEBXAAC","technical","","","","ADDJOBSTREAM"
prodAGPCWEBX.002inPRTCP.csv,"Run Participations ACTOR","9","","prodAGPCWEBX.002inPRTCP.csv","","","","START"
WEBXTRGAAB,"Run Participations ACTOR","10","PWEBXAAD","technical","","","","ADDJOBSTREAM"
prodAGPCCOPERNIC_LILWEBXL_COPinEC_AXACES.csv,"WAIT TRGFIC-WEBXWX2","9","","prodAGPCCOPERNIC_LILWEBXL_COPinEC_AXACES.csv","","","","START"
WEBXTRGAAC,"WAIT TRGFIC-WEBXWX2","10","PWEBXAAE","technical","","","","ADDJOBSTREAM"
prodWEBXdatainPWEBXWX1LIL_AH_I100_WX101_00*,"WAIT TRGFIC-WEBX224","9","","prodWEBXdatainPWEBXWX1LIL_AH_I100_WX101_00*","","","","START"
WRPT100Q,"REPORTXL","40","PWRPTTAG","technical","","","","Envoi mail utilisateurs"
WRPT-100Q-005T,"REPORTXL","45","PWRPT0B4","PWRPT-100Q-005T.BAT","PRAXCAPP02","AXA-CESSIONSSVC_SCHEDULING","F WRPT-007","Export (DataPump) AGPC"
WRPT-100Q-015T,"REPORTXL","55","PWRPT0B6","PWRPT-100Q-015T.KSH","PRATFUDMGTW01","svcudmu","F WRPT-004","Transfert Fichiers DUMP"
WRPT-100Q-025T,"REPORTXL","75","PWRPT0CA","PWRPT-100Q-025T.BAT","PRAXCAPP02","AXA-CESSIONSSVC_SCHEDULING","F WRPT-007","Import (DataPump) AGPC"

 

Comments:

1. This looks like just a list of strings, not a CSV.

2. It also looks like you want to delete everything except the last line. Wouldn't it be easier to describe what you need to keep?

3. So you want to remove all records/rows from the CSV where the Script or expected file(s) column ends with any of these extensions or contains "_00", and then write the remaining rows back to a new CSV? Is that understood correctly?

4. Yes please, Mathias R.

Answer #1:

Use Where-Object to filter your data:

# import data
$data = Import-Csv .\path\to\file.csv

# define list of extensions to filter out
$excludedExtensions = -split @'
.txt_go
.zip
.prd
.xml
.go
.csv
.txt
.xlsx
'@

# filter data 
$data |Where-Object {
  foreach($extension in $excludedExtensions){
    if($_.'Script or expected file(s)' -like "*$extension"){
      # immediately return $false and filter out row if ANY extension matches
      return $false
    }
  }

  # finally check for *_00* and return $true if not found
  return $_.'Script or expected file(s)' -notlike '*_00*'
} |Export-Csv .\path\to\output.csv -NoTypeInformation
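
Since the question title asks about a regular expression, the same filter could also be written as a single -notmatch test. This is only a sketch, not part of the original answer: the sample rows and the variable names $alternation, $pattern and $kept are made up for illustration, and the extension list is the one from the question.

```powershell
# Hypothetical sample rows standing in for the imported CSV (assumption, not the real file)
$data = @'
"Job","Script or expected file(s)"
"A","prodFACIMPORT*.xml"
"B","technical"
"C","PWRPT-100Q-005T.BAT"
"D","prodWEBX_WX101_00*"
"E","file.txt_go"
'@ | ConvertFrom-Csv

# extensions listed in the question
$excludedExtensions = '.txt_go', '.zip', '.prd', '.xml', '.go', '.csv', '.txt', '.xlsx'

# escape each extension so that '.' is matched literally, then join into one alternation
$alternation = ($excludedExtensions | ForEach-Object { [regex]::Escape($_) }) -join '|'

# match values that END with one of the extensions, or that contain _00 anywhere
$pattern = "(?:$alternation)`$|_00"

# keep only the rows whose column does NOT match the pattern
$kept = $data | Where-Object { $_.'Script or expected file(s)' -notmatch $pattern }
$kept | ConvertTo-Csv -NoTypeInformation
```

Like -like, the -notmatch operator is case-insensitive by default, so both versions should treat .XML and .xml the same way. If rows with an empty column should be dropped as well, as the question originally asked, adding `^$|` at the front of $pattern would be one way to do it.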
 

Comments:

1. Thank you very much, this works perfectly for me.

2. @Amakouladji Great, you're welcome! If this answer solves your problem, please consider marking it as "accepted" by clicking the checkmark on the left.