#python #pandas #csv

Question:

I'm fairly new to development on any platform and am trying to learn the basics of Python and pandas. While practicing the pandas groupby function, I get duplicate records in the result. Please see the data, the task, and the code I tried below. I'd appreciate any suggestions.

Read game.csv and game_score.csv

game.csv:

    id,url,genre,editors_choice,release_year,release_month,release_day
    0,/games/littlebigplanet-vita/vita-98907,Platformer,Y,2012,9,12
    1,/games/littlebigplanet-ps-vita-marvel-super-hero-edition/vita-20027059,Platformer,Y,2012,9,12
    2,/games/splice/ipad-141070,Puzzle,N,2012,9,12
    3,/games/nhl-13/xbox-360-128182,Sports,N,2012,9,11
    4,/games/nhl-13/ps3-128181,Sports,N,2012,9,11
    5,/games/total-war-battles-shogun/mac-142565,Strategy,N,2012,9,11
    6,/games/double-dragon-neon/xbox-360-131320,Fighting,N,2012,9,11
    7,/games/guild-wars-2/pc-896298,RPG,Y,2012,9,11
    8,/games/double-dragon-neon/ps3-131321,Fighting,N,2012,9,11
    9,/games/total-war-battles-shogun/pc-142564,Strategy,N,2012,9,11
    10,/games/tekken-tag-tournament-2/ps3-124584,Fighting,N,2012,9,11

game_score.csv

    id,score_phrase,title,platform,score
    0,Painful,The History Channel: Battle for the Pacific,Wii,2.5
    1,Awful,The History Channel: Battle For the Pacific,PlayStation 2,3
    2,Bad,The History Channel: Battle For The Pacific,PC,4.9
    3,Bad,The History Channel: Battle For the Pacific,Xbox 360,4.5
    4,Bad,The History Channel: Battle For the Pacific,PlayStation 3,4.5
    5,Awful,Hail to the Chimp,Xbox 360,3.5
    6,Awful,Hail To The Chimp,PlayStation 3,3.5
    7,Okay,Spyro: Enter The Dragonfly,PlayStation 2,6
    8,Okay,Spyro: Enter the Dragonfly,GameCube,6
    9,Okay,007 Legends,PlayStation 2,4
    10,Okay,007 Racing,GameCube,5

Merge the two CSV files on 'id'
Find the average score of each game using groupby
Sort the values in descending order to determine the rank
Save the result to an output CSV file
The output CSV file contains the columns title, score
Do not include the header row when writing the output CSV file

My code:

    import pandas as pd
    file_game = pd.read_csv('game.csv')
    file_game_score = pd.read_csv('game_score.csv')
    merged_game_file = pd.merge(file_game, file_game_score, on='id')
    final_data = merged_game_file[['title', 'score']]
    mean_df = final_data.groupby('title').mean()
    final_df = mean_df['score'].rank(ascending=0)
    print(final_df)

Output (final_df):

    007 Legends,4.5
    007 Racing,7.0
    Hail To The Chimp,2.5
    Hail to the Chimp,2.5
    Spyro: Enter The Dragonfly,8.5
    Spyro: Enter the Dragonfly,8.5
    The History Channel: Battle For The Pacific,6.0
    The History Channel: Battle For the Pacific,4.5
    The History Channel: Battle for the Pacific,1.0

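Answer #1:

Here is one idea: the titles differ only in letter case, so groupby treats them as separate games. Normalize the case before grouping.

Try: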

    import pandas as pd
    file_game = pd.read_csv('game.csv')
    file_game_score = pd.read_csv('game_score.csv')

    # Lowercase the titles so that '...Battle For the Pacific' and
    # '...Battle For The Pacific' end up in the same group
    file_game_score['title'] = file_game_score['title'].str.lower()

    merged_game_file = pd.merge(file_game, file_game_score, on='id')
    final_data = merged_game_file[['title', 'score']]
    mean_df = final_data.groupby('title').mean()
    final_df = mean_df['score'].rank(ascending=False)
    print(final_df)

Output:

    title
    007 legends                                    3.0
    007 racing                                     2.0
    hail to the chimp                              5.0
    spyro: enter the dragonfly                     1.0
    the history channel: battle for the pacific    4.0
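
The snippet above stops at printing the ranks. The remaining steps of the task (sorting by average score and writing a CSV without a header row) could be sketched roughly as follows. This is only one possible approach: the inline CSV strings are trimmed stand-ins for game.csv and game_score.csv, the function name rank_games is made up for illustration, and the output file name in the last comment is an assumption.

```python
import io

import pandas as pd

# Trimmed stand-ins for game.csv and game_score.csv (assumption: only the
# columns needed for this sketch are kept).
GAME_CSV = """id,genre
0,Platformer
1,Platformer
2,Puzzle
3,Sports
4,Sports
5,Strategy
6,Fighting
7,RPG
8,Fighting
9,Strategy
10,Fighting
"""

SCORE_CSV = """id,score_phrase,title,platform,score
0,Painful,The History Channel: Battle for the Pacific,Wii,2.5
1,Awful,The History Channel: Battle For the Pacific,PlayStation 2,3
2,Bad,The History Channel: Battle For The Pacific,PC,4.9
3,Bad,The History Channel: Battle For the Pacific,Xbox 360,4.5
4,Bad,The History Channel: Battle For the Pacific,PlayStation 3,4.5
5,Awful,Hail to the Chimp,Xbox 360,3.5
6,Awful,Hail To The Chimp,PlayStation 3,3.5
7,Okay,Spyro: Enter The Dragonfly,PlayStation 2,6
8,Okay,Spyro: Enter the Dragonfly,GameCube,6
9,Okay,007 Legends,PlayStation 2,4
10,Okay,007 Racing,GameCube,5
"""


def rank_games(game_src, score_src):
    """Return a title/score frame sorted by mean score, highest first."""
    games = pd.read_csv(game_src)
    scores = pd.read_csv(score_src)
    # Normalize titles so that case variants fall into one group.
    scores["title"] = scores["title"].str.strip().str.lower()
    merged = pd.merge(games, scores, on="id")
    ranked = (
        merged.groupby("title", as_index=False)["score"]
        .mean()
        .sort_values("score", ascending=False)
    )
    return ranked[["title", "score"]]


ranked = rank_games(io.StringIO(GAME_CSV), io.StringIO(SCORE_CSV))

# Write without the index and without the header row. A buffer is used here
# so the sketch has no side effects; a path such as "output.csv"
# (hypothetical name) could be passed instead.
out = io.StringIO()
ranked.to_csv(out, index=False, header=False)
csv_text = out.getvalue()
print(csv_text)
```

With real files, the two io.StringIO(...) arguments would be replaced by the file paths, since pd.read_csv accepts either.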

Comments:

1. Thanks @MDR. I learned this the hard way: a spreadsheet is not case-sensitive, whereas pandas is. Thank you very much for your time and help. 👍🏽👍🏽👍🏽
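
The case-sensitivity point from the comment can be checked on a few rows taken from game_score.csv. The sketch below also shows one way to group case-insensitively while keeping a readable title; it uses a helper key (the name title_key is made up here) instead of overwriting the title column. This is an illustration under those assumptions, not the approach from the answer.

```python
import pandas as pd

# Four rows from game_score.csv whose titles differ only in letter case.
scores = pd.DataFrame(
    {
        "title": [
            "Hail to the Chimp",
            "Hail To The Chimp",
            "Spyro: Enter The Dragonfly",
            "Spyro: Enter the Dragonfly",
        ],
        "score": [3.5, 3.5, 6.0, 6.0],
    }
)

# Grouping on the raw titles is case-sensitive: each spelling is its own group.
raw_groups = scores.groupby("title")["score"].mean()
print(len(raw_groups))

# Group on a case-folded helper key, but keep the first original spelling
# of each title for display.
key = scores["title"].str.casefold().rename("title_key")
by_key = scores.groupby(key).agg(
    title=("title", "first"),
    score=("score", "mean"),
)
print(by_key)
```

The first print reports four groups for what are really two games, while the keyed version yields two rows with the original capitalization preserved.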

Pandas: Rank games according to score
