How to capture a child pattern inside a parent pattern

#python-3.x #regex-group #regex-greedy


Question:

I have the following pattern (used with re.DOTALL) and text:

 (?P<year>20\d{2}).*?(?P<currency>RMB\d{1,3}(\W\d{3})*)
  
 The financial statements for 2019 prepared in accordance with Accounting Standards for Business Enterprises by
the Group were audited by Grant Thornton (Special General Partnership). The Company paid the auditor in aggregate
RMB2,500,000 and RMB800,000 in respect of financial statements audit and non-audit services in relation to internal
control for 2018 respectively. The financial statements for 2019 prepared in accordance with Accounting Standards for Business Enterprises by
the Group were audited by Grant Thornton (Special General Partnership). The Company paid the auditor in aggregate
RMB2,500,000 and RMB800,000 in respect of financial statements audit and non-audit services in relation to internal
control for 2019 respectively.
  

This gives the following result:

 Match 1
Named groups
currency    RMB2,500,000
year    2019
All groups
1.  2019
2.  RMB2,500,000
3.  ,000
Match 2
Named groups
currency    RMB2,500,000
year    2018
All groups
1.  2018
2.  RMB2,500,000
3.  ,000
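
For reference, the result above can be reproduced in Python with a short script along these lines. This is only a sketch: the sample text is an abbreviated stand-in for the text in the question (same years and amounts), and the variable names are illustrative. Each match holds a single value per group, which is why only the first amount after each year shows up.

```python
import re

# Abbreviated stand-in for the text in the question (assumption: same
# years and amounts, shorter sentences).
text = (
    "The financial statements for 2019 were audited. The Company paid\n"
    "RMB2,500,000 and RMB800,000 for audit and non-audit services for 2018 "
    "respectively. The financial statements for 2019 were audited. The Company paid\n"
    "RMB2,500,000 and RMB800,000 for audit and non-audit services for 2019 "
    "respectively."
)

pattern = re.compile(
    r"(?P<year>20\d{2}).*?(?P<currency>RMB\d{1,3}(\W\d{3})*)",
    re.DOTALL,
)

# A group keeps one value per match, so each match yields one year and
# only the first amount that follows it.
matches = [(m.group("year"), m.group("currency")) for m in pattern.finditer(text)]
for year, currency in matches:
    print(year, currency)
```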
  

But I want every currency amount for the corresponding year. The desired result is as follows:

 Match 1
Named groups
currency    RMB2,500,000
currency    RMB800,000
year    2019
All groups
1.  2019
2.  RMB2,500,000
3.  ,000
4. RMB800,000
5. ,000
Match 2
Named groups
currency    RMB2,500,000
currency    RMB800,000
year    2018
All groups
1.  2018
2.  RMB2,500,000
3.  ,000
4. RMB800,000
5. ,000
  

How can I do this?
Here is a link to the regular expression on pythex.org (the pattern and test string above, with DOTALL enabled).

Comments:

1. You can add an optional group: (?P<year>20\d{2}).*?(?P<currency>RMB\d{1,3}(\W\d{3})*(?: and (RMB\d{1,3}(\W\d{3})*))?) regex101.com/r/mxiGGd/1

2. @Thefourthbird What happens if there are several currency amounts for one year, for example 2019 foo RMB111,111 bar RMB111,111 bar RMB111,111 bar RMB111,111 ?

3. The pattern is not dynamic, so you would have to use multiple optional groups. How do you know which year you need, given that there are several in this passage: in respect of financial statements audit and non-audit services in relation to internal control for 2018 respectively. The financial statements for 2019 prepared in accordance with Accounting Standards for Business Enterprises by the Group were audited by Grant Thornton (Special General Partnership).
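
Since a single pattern cannot hold a variable number of captures in one group, one possible workaround is a two-step approach in Python. The sketch below is not taken from the thread: it assumes that each year "owns" the first amount after it plus any further amounts up to the next year-like token, which is one reading of the desired output in the question. The names `amounts_per_year`, `YEAR_THEN_AMOUNT`, `YEAR`, and `AMOUNT` are hypothetical, and the sample text is abbreviated.

```python
import re

# The asker's pattern, with the inner group made non-capturing.
YEAR_THEN_AMOUNT = re.compile(
    r"(?P<year>20\d{2}).*?(?P<currency>RMB\d{1,3}(?:\W\d{3})*)", re.DOTALL
)
YEAR = re.compile(r"20\d{2}")
AMOUNT = re.compile(r"RMB\d{1,3}(?:\W\d{3})*")


def amounts_per_year(text):
    """Pair each matched year with its first amount and any amounts that
    follow before the next year-like token (an assumed grouping rule)."""
    results = []
    for m in YEAR_THEN_AMOUNT.finditer(text):
        # Step 2: scan from the end of the match up to the next year, if any.
        next_year = YEAR.search(text, m.end())
        stop = next_year.start() if next_year else len(text)
        extra = AMOUNT.findall(text, m.end(), stop)
        results.append((m.group("year"), [m.group("currency"), *extra]))
    return results


# Abbreviated stand-in for the text in the question (assumption).
text = (
    "The financial statements for 2019 were audited. The Company paid\n"
    "RMB2,500,000 and RMB800,000 for audit and non-audit services for 2018 "
    "respectively. The financial statements for 2019 were audited. The Company paid\n"
    "RMB2,500,000 and RMB800,000 for audit and non-audit services for 2019 "
    "respectively."
)

result = amounts_per_year(text)
for year, amounts in result:
    print(year, amounts)
```

The same function also handles the case from comment 2, where one year is followed by an arbitrary number of amounts. Whether an amount really belongs to the year before or after it is a question about the text itself, as comment 3 points out, so the grouping rule may need adjusting.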