Python IndexError when using a CSV file to create a Venn diagram

#python #csv

Question:

I'm trying to create a Venn diagram using information from a CSV file I created. However, on line 13 of my code I get an IndexError saying that the list index is out of range. I was hoping someone here might have an idea why. This is the code I'm using to try to create the Venn diagram:

    from matplotlib_venn import venn2
    import matplotlib.pyplot as plt
    import csv
    from sympy import FiniteSet

    def get_Sports(file_name):
        football=[]
        others=[]
        with open(file_name) as file:
            reader = csv.reader(file)
            next(reader)
            for row in reader:
                if row[1] == 1:
                    football.append(row[0])
                if row[2] == 1:
                    others.append(row[0])
        return football, others

    def plot_venn(f, o):
        venn2(subsets=(f, o))
        plt.show()

    if __name__ == '__main__':
        file = input('Input the file path with the list of student ids, whether they play football, and whether they play other sports: ')
        football, others = get_Sports(file)
        football = FiniteSet(*football)
        others = FiniteSet(*others)
        plot_venn(football, others)

Here is the CSV file I'm using:

    ID, Football, Other Sports
    1, 1, 0
    2, 0, 1
    3, 0, 0
    4, 1, 1
    5, 1, 0
    6, 0, 0
    7, 0, 0
    8, 1, 1
    9, 1, 1
    10, 1, 0

1. The values you get from csv.reader in row are strings. Your code compares them with integers. So both if tests fail, and the append() calls never happen.

2. Please copy the full error message into your question, formatted as code.

Answer #1:

There is a bug in your logic: when the CSV is read, every value is stored as a string. So row[1] == 1 will be False, because row[1] is the string '1' (with this file, ' 1', including the space after the comma).
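A quick way to see this is to read a few rows and inspect them. The sketch below uses an in-memory copy of the first rows of the CSV from the question; the sample string and the io.StringIO wrapper are only there for illustration and are not part of the original code.

```python
import csv
import io

# Illustrative in-memory copy of the first rows of the CSV from the question.
sample = "ID, Football, Other Sports\n1, 1, 0\n2, 0, 1\n"

reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header row
first_row = next(reader)

print(first_row)                    # ['1', ' 1', ' 0'] (strings, spaces kept)
print(first_row[1] == 1)            # False: a str never equals an int
print(first_row[1].strip() == '1')  # True: str compared with str
```

csv.reader also accepts skipinitialspace=True, which drops the space after each delimiter, so the .strip() call becomes unnecessary.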

To make your code work, your function should be:

    def get_Sports(file_name):
        football=[]
        others=[]
        with open(file_name) as file:
            reader = csv.reader(file)
            next(reader)
            for row in reader:
                if row[1].strip() == '1':
                    football.append(row[0])
                if row[2].strip() == '1':
                    others.append(row[0])
        return football, others
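Note that the string comparison by itself would not raise an IndexError. One possible cause, and this is an assumption because the question does not include the full traceback, is a blank or incomplete line in the file: csv.reader yields an empty list for a blank line, so row[1] fails. A minimal sketch of a variant that skips such rows follows; the name get_sports_safe is hypothetical, and it takes an open file object rather than a path so it can be tried on in-memory data.

```python
import csv
import io


def get_sports_safe(file_obj):
    """Return (football, others) ID lists, skipping rows with fewer than 3 fields."""
    football = []
    others = []
    # skipinitialspace=True drops the space that follows each comma.
    reader = csv.reader(file_obj, skipinitialspace=True)
    next(reader)  # skip the header row
    for row in reader:
        if len(row) < 3:
            # A blank line comes back as [], which would raise IndexError on row[1].
            continue
        if row[1] == '1':
            football.append(row[0])
        if row[2] == '1':
            others.append(row[0])
    return football, others


# Illustrative data with a blank line in the middle and one at the end.
sample = "ID, Football, Other Sports\n1, 1, 0\n\n2, 0, 1\n4, 1, 1\n\n"
football, others = get_sports_safe(io.StringIO(sample))
print(football)  # ['1', '4']
print(others)    # ['2', '4']
```

If the real file has no blank or short lines, the cause lies elsewhere, and the full traceback requested in the comments would be needed to pin it down.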

Personally, I prefer pandas to csv. So I did the same thing you did, only with pandas, and then used boolean indexing to end up with what you need. This would be better/faster if your CSV file is huge:

    from matplotlib_venn import venn2
    import matplotlib.pyplot as plt
    import pandas as pd
    from sympy import FiniteSet

    def get_Sports(file_name):
        df = pd.read_csv(file_name)
        football = list(df[df.iloc[:,1] == 1].iloc[:,0])
        others = list(df[df.iloc[:,-1] == 1].iloc[:,0])
        return football, others

    def plot_venn(f, o):
        venn2(subsets=(f, o), set_labels = ('Football', 'Other Sports'))
        plt.show()

    if __name__ == '__main__':
        file = input('Input the file path with the list of student ids, whether they play football, and whether they play other sports: ')
        football, others = get_Sports(file)
        football = FiniteSet(*football)
        others = FiniteSet(*others)
        plot_venn(football, others)

Output: [Venn diagram image]
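As a text cross-check of what the diagram should show, the three region sizes can be computed with plain Python sets. The ID sets below are transcribed by hand from the sample CSV in the question, and the variable names are illustrative; this is a sanity check, not part of the original answer.

```python
# IDs transcribed from the sample CSV in the question.
football = {1, 4, 5, 8, 9, 10}  # rows where Football == 1
others = {2, 4, 8, 9}           # rows where Other Sports == 1

only_football = football - others  # students who play football only
only_others = others - football    # students who play other sports only
both = football & others           # students who play both

print(len(only_football), len(only_others), len(both))  # 3 1 3
```

These should match the counts shown in the football-only, other-sports-only, and overlapping regions of the plot.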