#python #nlp #gensim #summarization #summarize
Question:
I am getting repeated lines in the output of my summarizer. I am using gensim in Python to summarize text documents. How can I remove the repeated lines from the summarizer's output so that only unique lines remain? The input file looks like this:
From: Jos
To: Halley, Ibizo /FR
Cc: pqr Secretariat; Björnsson Ulrika
Subject: [EXTERNAL] pqr Response to Letter of Intent for a Variation WS procedure:SE/H/xxxx/WS/
Date: vendredi 1 juin 2018 13:16:48
Attachments: image001.jpg
A07_SE_xxx yy R&D.PDF
Dear Ibizo,
Thank you for your letter of intent.
The pqr agrees, on the basis of the documentation provided, that the above mentioned work-
sharing application as specified in the enclosed letter of intent is acceptable for submission under
Article 20 of the Commission Regulation (EC) No 1234/2008 of 24 November 2008.
The reference authority for the worksharing procedure will be Sweden and the assigned work sharing
procedure number will be:
A07: SE/H/xxxx/WS/
Please be advised that this confirmation is not to be considered as validation of your application. The
validity of the worksharing application will be checked by the reference authority after submission.
Please liaise with the assigned reference authority for the further proceedings.
Kind regards,
Joe
Assistant Administrator
Parallel Distribution & Certificates
Committees & Inspections Department
Panthers Medicines Agency
30 ABC St, Michigan lane
Fax 44 (0)20 certificate@zz.europa.eu | www.zz.europa.eu
This message and any attachment contain information which may be confidential or otherwise
protected from disclosure. It is intended for the addressee(s) only and should not be relied upon as
legal advice unless it is otherwise stated. If you are not the intended recipient(s) (or authorised by
an addressee who received this message), access to this e-mail, or any disclosure or copying of its
contents, or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If
you have received this e-mail in error, please inform the sender immediately.
P Please consider the environment and don't print this e-mail unless you really need to
From: Jos
Sent: 30 April 2018 11:17
To: Ibizo.Halley@xxx.com
Cc: pqr Secretariat
Subject: RE: Alfuzosin Hydrochloride - Request for Worksharing procedure
Dear Ibizo,
Thank you for your zzil.
The letter of intent will be discussed in the May 2018 pqr meeting and you will receive feedback
within two weeks following the meeting.
Kind regards,
Joe
Assistant Administrator
Parallel Distribution & Certificates
Committees & Inspections Department
mailto:eretta.ab@zz.europa.eu
mailto:Ibizo.Halley@xxx.com
mailto:H-pqrSecretariat@zz.europa.eu
mailto:Ulrika.Bjornsson@mpa.se
mailto:certificate@zz.europa.eu
pqr/162/2010/Rev.2, August 2014
26 April 2018
pqr Secretariat
Panthers Medicines Agency
30 Bluegoon Place, ABC Wharf
ABC E14 5EU
United Kingdom
Subject: Letter of intent for the submission of a worksharing procedure to the pqr according
to Article 20 of Commission Regulation (EC) No 1234/2008
Worksharing Applicant details:
Name : xxx-yy R&D
Address : 1, lane Pierre Brossolette
91385 Chilly-Maz
Sw
Contact person details
(i.e. name, address, e-mail
address, phone number, fax
number)
: Ibizo Halley
1, lane Pierre Brossolette
91385 Chilly-Maz
Sw
zzil: Ibizo.halley@xxx.com
Tel : 33 1 60 49 51 61
Application details:
This letter of intent for the submission of a Type II following a worksharing procedure according to
Article 20 of Commission Regulation (EC) No 1234/2008, concerns the following medicinal products
authorised via MRP and national procedures:
Products authorized via MRP:
Alfuzosin 2.5 mg film-coated tablets
Product name Active
substance(s)
MRP number
XATRAL Alfuzosin
hydrochloride
SE/H/0112/001
mailto:Ibizo.halley@xxx.com
Alfuzosin 5 mg prolonged-release tablets
Product name Active
substance(s)
MRP number
XATRAL SR 5 MG Alfuzosin
hydrochloride
SE/H/0112/002
XATRAL Alfuzosin
hydrochloride
SE/H/0112/002
Alfuzosin 10 mg prolonged-release tablets
Product name Active
substance(s)
MRP number
XATRAL UNO 10 MG Alfuzosin
hydrochloride
SE/H/0112/003
ALFUZOSIN WINTHROP
UNO 10 MG
Alfuzosin
hydrochloride
DE/H/2130/001
ALFUZOSIN ZENTIVA 10
MG
Alfuzosin
hydrochloride
DE/H/2131/001/MR
UROXATRAL Alfuzosin
hydrochloride
DE/H/2129/001
Alfuzosin Zentiva 10 mg
Retardtabletten
Alfuzosin
hydrochloride
DE/H/2131/001
XATRAL OD 10 MG Alfuzosin
hydrochloride
SE/H/0112/003
Products authorised via national procedure:
Alfuzosin 2.5 mg film-coated tablets
Product name Active
substance(s)
National MA
number
Member state
XATRAL Alfuzosin
hydrochloride
NO APPLICATION
CODE -#10600
Denmark
XATRAL 2.5 MG Alfuzosin
hydrochloride
NL 14785 France
ALFUZOSIN
WINTHROP 2.5 MG
Alfuzosin
hydrochloride
32177.00.00 Germany
UROXATRAL Alfuzosin
hydrochloride
18111.00.00 Germany
XATRAL Alfuzosin
hydrochloride
NO APPLICATION
CODE -#10602
Greece
XATRAL 2.5 MG Alfuzosin
hydrochloride
PA 540/162/1 Ireland
XATRAL Alfuzosin
hydrochloride
027314018 Italy
MITTOVAL Alfuzosin
hydrochloride
026670024 Italy
ALFUZOSINA
ZENTIVA
Alfuzosin
hydrochloride
NO APPLICATION
CODE -#10163
Italy
XATRAL Alfuzosin
hydrochloride
RVG 13689 Netherlands
DALFAZ Alfuzosin
hydrochloride
R/6812 Poland
BENESTAN 2.5 MG Alfuzosin
hydrochloride
60031 Spain
XATRAL 2.5 MG Alfuzosin
hydrochloride
PL 04425/0655 United Kingdom
ALFUZOSIN
HYDROCHLORIDE
2.5MG
Alfuzosin
hydrochloride
PL 17780/0220 United Kingdom
Alfuzosin 5 mg prolonged-release tablets
Product name Active
substance(s)
National MA
number
Member state
XATRAL 5 RETARD Alfuzosin
hydrochloride
NAT-H-4908-01 Belgium
XATRAL Alfuzosin
hydrochloride
17139
Cyprus
XATRAL LP 5 MG Alfuzosin
hydrochloride
NL 19090 France
ALFUZOSIN
WINTHROP 5 MG
Alfuzosin
hydrochloride
34637.00.00 Germany
XATRAL Alfuzosin
hydrochloride
NO APPLICATION
CODE -#10812
Greece
ALFETIM SR 5 MG Alfuzosin
hydrochloride
OGYI-T-4374/01 Hungary
ALFUZOSINA
ZENTIVA
Alfuzosin
hydrochloride
NO APPLICATION
CODE -#8994
Italy
XATRAL 5 RETARD Alfuzosin
hydrochloride
583/98/12/4785 Luxembourg
XATRAL SR 5 MG Alfuzosin
hydrochloride
MA082/05001 Malta
DALFAZ SR Alfuzosin
hydrochloride
8127 Poland
XATRAL LP 5 MG Alfuzosin
hydrochloride
1026/2008 Romania
XATRAL 5-SR Alfuzosin
hydrochloride
77/0275/96-S Slovakia
BENESTAN
RETARD 5 MG
Alfuzosin
hydrochloride
60767 Spain
Alfuzosin 10 mg prolonged-release tablets
Product name Active
substance(s)
National MA
number
Member state
XATRAL UNO
10 MG
Alfuzosin
hydrochloride
NAT-H-4908-04 Belgium
XATRAL XL 10 MG Alfuzosin
hydrochloride
19244 Cyprus
XATRAL SR 10 MG Alfuzosin
hydrochloride
345201 Estonia
XATRAL CR 10 MG Alfuzosin
hydrochloride
13973 Finland
ALFUZOSINE
ZENTIVA LP 10 MG
Alfuzosin
hydrochloride
NL 24407 France
XATRAL LP 10 MG Alfuzosin
hydrochloride
NL 24386 France
XATRAL OD Alfuzosin
hydrochloride
NO APPLICATION
CODE -#9520
Greece
ALFETIM UNO
10 MG
Alfuzosin
hydrochloride
OGYI-T-8022/01 Hungary
XATRAL 10 MG Alfuzosin
hydrochloride
PA 540/162/3 Ireland
MITTOVAL Alfuzosin
hydrochloride
026670048-051 Italy
XATRAL 10 MG Alfuzosin
hydrochloride
027314044-057 Italy
ALFUZOSINA
ZENTIVA
Alfuzosin
hydrochloride
NO APPLICATION
CODE -#9579
Italy
XATRAL SR 10 MG Alfuzosin
hydrochloride
99-0702 Latvia
XATRAL SR 10 MG Alfuzosin
hydrochloride
LT-2000/7118/10 Lithuania
XATRAL UNO
10 MG
Alfuzosin
hydrochloride
0005/01/09/0045 Luxembourg
XATRAL XL 10 MG Alfuzosin
hydrochloride
MA082/05002 Malta
XATRAL XR 10 MG Alfuzosin
hydrochloride
RVG 23923 Netherlands
DALFAZ UNO Alfuzosin
hydrochloride
8378 Poland
BENESTAN OD
10 MG
Alfuzosin
hydrochloride
99/H/0006/01 Portugal
ALFUZOSINA
ZENTIVA, 10 MG
Alfuzosin
hydrochloride
99/H/0007/001 Portugal
XATRAL SR 10 MG Alfuzosin
hydrochloride
7893/2006 Romania
UNIBENESTAN
10 MG
Alfuzosin
hydrochloride
63605
Spain
XATRAL XL 10 MG Alfuzosin
hydrochloride
PL 04425/0657 United Kingdom
BESAVAR XL Alfuzosin
hydrochloride
PL 17780/0221 United Kingdom
The following variation is intended to be part of the work-sharing procedure:
Number as in the
classification guideline:
Title of variation as in the classification
guideline
Type of variation:
C.I.4
Changes in the Summary of Product
Characteristics, Labelling or package
Leaflet due new quality, preclinical,
clinical or pharmacovigilance data
Type II
Justification for worksharing : xxx submitted for alfuzosin hydrochloride separate national and MRP variations for implementation of CCDS V13 including
among other topics the addition of a contraindication to strong
CYP3A4 inhibitors in the sections 4.3 and 4.5.
The MAH received on 04 April 2018 a letter from pqr
(zz/pqr/195547/2018) requesting to re-submit the variation
for this contraindication as a work-sharing application including
all MRP and nationally authorised products to harmonise the
assessment of the contraindication in section 4.3 and 4.5 of the
SmPC across the EU (provided in Annex I).
Justification for grouping : Not applicable
Intended submission date : 30 June 2018
Preferred Reference Authority
: The Para Medical Products Agency, as RMS of the MRP
procedure SE/H/0112/001-003
Explanation that all MAs
concerned belong to the
same holder
: I hereby confirm that all the marketing authorisations, listed in application details (refer above), concerned by the worksharing
procedure belong to the same marketing authorisation holder, as
they are part of the same mother company xxx, as per the
Commission communication 98/C 229/03.
Yours sincerely,
Ibizo HALLEY
xxx-yy R&D, Europe Region
Global Logistics Affairs Europe
Please send this letter electronically to the pqr Secretariat (H-pqrSecretariat@zz.europa.eu)
or RMS as relevant.
mailto:H-pqrSecretariat@zz.europa.eu
ANNEX 1
30 Bluegoon Place ● ABC Wharf ● ABC E14 5EU ● United Kingdom
Telephone 44 (0)20 3660 6000 Facsimile 44 (0)20 3660 5520
Dr.ssa Maty Lecc
xxx S.p.A
Viale L. Bodio
20158 AUGB
Italy
E-mail: DRA@xxx.com
4 April 2018
zz/pqr/195547/2018
Subject: Request for submission of variation worksharing procedure for Xatral (alfuzosin)
and related names
Dear Dr Maty Lecchi,
During the March meeting, the pqr was informed that separate national and MRP variations have
been submitted across EU Member States to request the inclusion of the below contraindication for
Xatral (alfuzosin) and related names:
Section 4.3
Concomitant intake of strong inhibitors of CYP3A4 (see paragraph 4.5).
The parallel submissions in several Member States have led to a disharmonised assessment of the
contraindication. In the interest of public health across the Panthers Union, the pqr requests xxx
to re-submit the variation as a worksharing application including all MRP, DCP and nationally
authorised products to harmonise the assessment of the contraindication in section 4.3 of the SmPC
across the EU.
Please note that a separate letter on an independent issue to this has been sent to Esther de Bles,
xxx-yy Netherlands B.V.. However, there are general concerns by the pqr on the lack of use
of variation worksharing by xxx-yy in these cases.
Kind Regards,
Laura Oliveira Santamaria
Chair of pqr
mailto:DRA@xxx.com
Worksharing Applicant details:
Name
xxx-yy R&D, Europe Region
Global Logistics Affairs Europe
Panthers Medicines Agency
30 ABC St, Michigan lane
Fax 44 (0)20 3660 5525 certificate@zz.europa.eu | www.zz.europa.eu
This message and any attachment contain information which may be confidential or otherwise
protected from disclosure. It is intended for the addressee(s) only and should not be relied upon as
legal advice unless it is otherwise stated. If you are not the intended recipient(s) (or authorised by
an addressee who received this message), access to this e-mail, or any disclosure or copying of its
contents, or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If
you have received this e-mail in error, please inform the sender immediately.
P Please consider the environment and don't print this e-mail unless you really need to
From: Ibizo.Halley@xxx.com [mailto:Ibizo.Halley@xxx.com]
Sent: 27 April 2018 17:40
To: pqr Secretariat
Subject: Alfuzosin Hydrochloride - Request for Worksharing procedure
Dear Sirs, Madams,
We are pleased to send you a request for the submission of a Type II variation following a worksharing
procedure according to Article 20 of Commission Regulation (EC) No 1234/2008 for Alfuzosin
hydrochloride containing products.
The variation concerns the addition of a contraindication with strong CYP 3A4 inhibitors in section 4.3
and 4.5.
The worksharing procedure has been requested to xxx by the chair of pqr, Mme Oliveira
Santamaria, the letter is attached as Annex of the letter of intent attached.
Thank you in advance for your agreement.
Kind regards,
Ibizo Halley
GEM/EP and OTC switch
EU Regional Logistics Product manager
Global Logistics Affairs
xxx R&D
Phone: 33 1 60 49 51 61
logoGRA 1
________________________________________________________________________
This e-mail has been scanned for all known viruses by Panthers Medicines Agency.
Comments:
1. Please provide a short document of about 10 lines that includes the duplicate lines for us to work with. Nobody has time to read a whole book.
2. My question is this: the output of the gensim summarizer contains repeated lines. How do I process it so that each line appears only once?
3. @chekmate please see my answer below, and don't forget to like and upvote it if it helped you.
Answer #1:
So your question is "How do I remove duplicate sentences from a document?" I suggest using textblob. Here is some sample code.
from textblob import TextBlob

document = 'This is a sentence. This is another sentence. This is a sentence. This is another sentence. This is a third sentence.'

def get_unique_sentences(document):
    unique_sentences = []
    for sentence in [sent.raw for sent in TextBlob(document).sentences]:
        if sentence not in unique_sentences:
            unique_sentences.append(sentence)
    return ' '.join(unique_sentences)

get_unique_sentences(document)
>>>'This is a sentence. This is another sentence. This is a third sentence.'
Let me know if this helps.
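If installing textblob is not an option, the same idea can be roughly approximated with the standard library. This is only a sketch: `unique_sentences` is a hypothetical helper name, and the regular expression assumes sentences end in `.`, `!` or `?` followed by whitespace, which is a much cruder rule than a real sentence tokenizer.

```python
import re

def unique_sentences(text):
    # Split after '.', '!' or '?' when followed by whitespace.
    # This is a rough heuristic, not a real sentence tokenizer.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # dict.fromkeys keeps the first occurrence of each sentence and
    # preserves insertion order (guaranteed since Python 3.7).
    return " ".join(dict.fromkeys(sentences))

document = ("This is a sentence. This is another sentence. "
            "This is a sentence. This is another sentence. This is a third sentence.")
print(unique_sentences(document))
```

Using `dict.fromkeys` instead of a list with `not in` checks avoids rescanning the whole list for every sentence, which matters for long documents.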
Comments:
1. So I need to apply this to the output of my summarizer?
2. Yes, you need to apply the function to the output of your model.
3. I don't have a full stop to mark the end of a sentence. The summarizer's output does not contain any full stops. How do I find the repeated phrases then?
4. @chekmate please check the textblob page (textblob.readthedocs.io/en/dev/_modules/textblob/…). Sentences do not have to end with a period. For example, questions end with a question mark.
Answer #2:
A simple way would be to use a set and skip lines that have already been seen.
For example:
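# document is expected to be an iterable of lines, e.g. text.splitlines()
seen_before = set()
lines = []
for line in document:
    if line in seen_before:
        continue  # skip lines we have already kept
    lines.append(line)
    seen_before.add(line)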
The variable lines would then contain each distinct line exactly once, in the order of first appearance.
Of course, this should be done per document (start with a fresh set each time), since you don't want lines seen in other documents to be filtered out.
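When the summarizer output has no sentence-ending punctuation, as the asker reports, deduplicating by line is the natural fallback. The sketch below extends the set-based idea and assumes the output is one newline-separated string. `unique_lines` is a hypothetical helper name, and treating lines that differ only in case or spacing as duplicates is an assumption that may not suit every document.

```python
def unique_lines(text):
    seen = set()
    kept = []
    for line in text.splitlines():
        # Normalize for comparison only: collapse whitespace and ignore case.
        key = " ".join(line.split()).casefold()
        # Drop blank lines and lines whose normalized form was already seen.
        if not key or key in seen:
            continue
        seen.add(key)
        kept.append(line.strip())
    return "\n".join(kept)

summary = "Kind regards,\nJoe\nAssistant Administrator\nkind  regards,\nJoe \nThank you in advance"
print(unique_lines(summary))
```

The first spelling of each line is the one that is kept. Note that this also drops blank lines, so remove the `not key` check if empty lines need to survive.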
Comments:
1. I don't have a full stop to mark the end of a sentence. The summarizer's output does not contain any full stops. How do I find the repeated phrases then?