Dimension error when feeding a non-linguistic dataset into an LSTM-based encoder-decoder model with attention

#multidimensional-array #lstm #tf.keras #attention-model #encoder-decoder

Question:

I am trying to implement an attention-based LSTM encoder-decoder model for multi-class classification. The dataset is not linguistic in nature. The characteristics of my dataset are:

    x_train.shape = (930, 5)
    y_train.shape = (930, 3)
    x_test.shape = (405, 5)
    y_test.shape = (405, 3)

    x_train.head()
        val1    val2    val3    val4    val5
        10000   00101   01000   10000   00110
        10000   00101   01000   10000   00110
        00010   01001   01001   01000   00110
        00100   01000   01001   01000   00111
        00101   01000   01001   01000   00110

 

Then I converted the dataframe values into an array:

 
    x_tr = np.array(x_train)
    array([['10000', '00101', '01000', '10000', '00110'],
           ['10000', '00101', '01000', '10000', '00110'],
           ['00010', '01001', '01001', '01000', '00110'],
           ...,
           ['01001', '00101', '01001', '01001', '00110'],
           ['00101', '01000', '01001', '01000', '00110'],
           ['00100', '01000', '01001', '01000', '00111']], dtype=object)
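
Note that np.array(x_train) keeps the values as 5-character strings (hence the quotes and dtype=object above); nothing is numeric yet. A quick check, with a small helper added here purely for illustration (the helper name is not from the original code):

    print(x_tr.dtype, type(x_tr[0, 0]))   # object <class 'str'>

    def cell_to_bits(cell):
        # '10000' -> [1, 0, 0, 0, 0]
        return [int(ch) for ch in cell]

    print(cell_to_bits(x_tr[0, 0]))        # [1, 0, 0, 0, 0]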

 

Then I reshaped my array to 3D so that it can be used as input to the LSTM-based encoder-decoder model:

 
    X_TR = np.reshape(x_tr, (930, 5, -1))
    Y_TR = np.reshape(y_tr, (930, 3, -1))
    X_TE = np.reshape(x_te, (405, 5, -1))
    Y_TE = np.reshape(y_te, (405, 3, -1))
    
    print(X_TR.shape, x_tr.shape)
    (930, 5, 1) (930, 5)
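
Because of the trailing -1, the reshape only adds a last axis of size 1, i.e. one whole string per time step. For comparison, a minimal sketch of expanding each 5-character string into five numeric features instead (this per-character interpretation of the data is an assumption, not something stated above):

    X_TR = np.array([[[int(ch) for ch in cell] for cell in row] for row in x_tr],
                    dtype='float32')
    print(X_TR.shape)   # (930, 5, 5), i.e. (samples, time_steps, features)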

 

Now I define a simple model, but I get the error pasted below the code:

 
    def main():
        time_steps, input_dim, output_dim = 5, 5, 3
        model_input = Input(shape=(time_steps, input_dim))
        x = LSTM(64, return_sequences=True)(model_input)
        x = Attention(32)(x)
        x = Dense(1)(x)
        model = Model(model_input, x)
        model.compile(loss='mae', optimizer='adam')
        print(model.summary())
        model.fit(X_TR, Y_TR, epochs=10)
    
        # test save/reload model.
        pred1 = model.predict(X_TE)
       
        np.testing.assert_almost_equal(pred1, Y_TE)
        print('Success.')
    
    
    if __name__ == '__main__':
        main()
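
The imports are not shown in the snippet above. Presumably they are along the following lines; in particular, the Attention layer is assumed to come from the third-party attention (keras-attention) package, judging by the layer names in the summary below. The comments compare the shapes the compiled model expects with what is actually passed to fit():

    import numpy as np
    from tensorflow.keras.layers import Dense, Input, LSTM
    from tensorflow.keras.models import Model
    from attention import Attention   # assumption: the keras-attention package

    # model.input_shape  -> (None, 5, 5)   while X_TR.shape -> (930, 5, 1)
    # model.output_shape -> (None, 1)      while Y_TR.shape -> (930, 3, 1)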

 

The output is as follows:

 
    Model: "model"
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_1 (InputLayer)            [(None, 5, 5)]       0                                            
    __________________________________________________________________________________________________
    lstm (LSTM)                     (None, 5, 64)        17920       input_1[0][0]                    
    __________________________________________________________________________________________________
    last_hidden_state (Lambda)      (None, 64)           0           lstm[0][0]                       
    __________________________________________________________________________________________________
    attention_score_vec (Dense)     (None, 5, 64)        4096        lstm[0][0]                       
    __________________________________________________________________________________________________
    attention_score (Dot)           (None, 5)            0           last_hidden_state[0][0]          
                                                                     attention_score_vec[0][0]        
    __________________________________________________________________________________________________
    attention_weight (Activation)   (None, 5)            0           attention_score[0][0]            
    __________________________________________________________________________________________________
    context_vector (Dot)            (None, 64)           0           lstm[0][0]                       
                                                                     attention_weight[0][0]           
    __________________________________________________________________________________________________
    attention_output (Concatenate)  (None, 128)          0           context_vector[0][0]             
                                                                     last_hidden_state[0][0]          
    __________________________________________________________________________________________________
    attention_vector (Dense)        (None, 128)          16384       attention_output[0][0]           
    __________________________________________________________________________________________________
    dense (Dense)                   (None, 1)            129         attention_vector[0][0]           
    ==================================================================================================
    Total params: 38,529
    Trainable params: 38,529
    Non-trainable params: 0
    __________________________________________________________________________________________________
    None
    Epoch 1/10
    WARNING:tensorflow:AutoGraph could not transform <function Model.make_train_function.<locals>.train_function at 0x000002813E5532F0> and will run it as-is.
    Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
    Cause: 'arguments' object has no attribute 'posonlyargs'
    To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
    WARNING: AutoGraph could not transform <function Model.make_train_function.<locals>.train_function at 0x000002813E5532F0> and will run it as-is.
    Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
    Cause: 'arguments' object has no attribute 'posonlyargs'
    To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
    WARNING:tensorflow:Model was constructed with shape (None, 5, 5) for input KerasTensor(type_spec=TensorSpec(shape=(None, 5, 5), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'input_1'"), but it was called on an input with incompatible shape (None, 5, 1).
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in converted_call(f, args, kwargs, caller_fn_scope, options)
        446     program_ctx = converter.ProgramContext(options=options)
    --> 447     converted_f = _convert_actual(target_entity, program_ctx)
        448     if logging.has_verbosity(2):
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in _convert_actual(entity, program_ctx)
        283 
    --> 284   transformed, module, source_map = _TRANSPILER.transform(entity, program_ctx)
        285 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\pyct\transpiler.py in transform(self, obj, user_context)
        285     if inspect.isfunction(obj) or inspect.ismethod(obj):
    --> 286       return self.transform_function(obj, user_context)
        287 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\pyct\transpiler.py in transform_function(self, fn, user_context)
        469           # TODO(mdan): Confusing overloading pattern. Fix.
    --> 470           nodes, ctx = super(PyToPy, self).transform_function(fn, user_context)
        471 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\pyct\transpiler.py in transform_function(self, fn, user_context)
        362     node = self._erase_arg_defaults(node)
    --> 363     result = self.transform_ast(node, context)
        364 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in transform_ast(self, node, ctx)
        251     unsupported_features_checker.verify(node)
    --> 252     node = self.initial_analysis(node, ctx)
        253 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in initial_analysis(self, node, ctx)
        239     node = qual_names.resolve(node)
    --> 240     node = activity.resolve(node, ctx, None)
        241     node = reaching_definitions.resolve(node, ctx, graphs)
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\pyct\static_analysis\activity.py in resolve(node, context, parent_scope)
        708 def resolve(node, context, parent_scope=None):
    --> 709   return ActivityAnalyzer(context, parent_scope).visit(node)
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\pyct\transformer.py in visit(self, node)
        444 
    --> 445       result = super(Base, self).visit(node)
        446 
    
    G:\anaconda\envs\tensorflow_env\lib\ast.py in visit(self, node)
        252         visitor = getattr(self, method, self.generic_visit)
    --> 253         return visitor(node)
        254 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\pyct\static_analysis\activity.py in visit_FunctionDef(self, node)
        578       # Argument annotartions (includeing defaults) affect the defining context.
    --> 579       node = self._visit_arg_annotations(node)
        580 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\pyct\static_analysis\activity.py in _visit_arg_annotations(self, node)
        554     self._track_annotations_only = True
    --> 555     node = self._visit_arg_declarations(node)
        556     self._track_annotations_only = False
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\pyct\static_analysis\activity.py in _visit_arg_declarations(self, node)
        559   def _visit_arg_declarations(self, node):
    --> 560     node.args.posonlyargs = self._visit_node_list(node.args.posonlyargs)
        561     node.args.args = self._visit_node_list(node.args.args)
    
    AttributeError: 'arguments' object has no attribute 'posonlyargs'
    
    During handling of the above exception, another exception occurred:
    
    ValueError                                Traceback (most recent call last)
    <ipython-input-6-8f7d95d574b4> in <module>
         20 
         21 if __name__ == '__main__':
    ---> 22     main()
    
    <ipython-input-6-8f7d95d574b4> in main()
          8     model.compile(loss='mae', optimizer='adam')
          9     print(model.summary())
    ---> 10     model.fit(X_TR, Y_TR, epochs=10)
         11 
         12     # test save/reload model.
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
       1079                 _r=1):
       1080               callbacks.on_train_batch_begin(step)
    -> 1081               tmp_logs = self.train_function(iterator)
       1082               if data_handler.should_sync:
       1083                 context.async_wait()
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\eager\def_function.py in __call__(self, *args, **kwds)
        826     tracing_count = self.experimental_get_tracing_count()
        827     with trace.Trace(self._name) as tm:
    --> 828       result = self._call(*args, **kwds)
        829       compiler = "xla" if self._experimental_compile else "nonXla"
        830       new_tracing_count = self.experimental_get_tracing_count()
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\eager\def_function.py in _call(self, *args, **kwds)
        869       # This is the first call of __call__, so we have to initialize.
        870       initializers = []
    --> 871       self._initialize(args, kwds, add_initializers_to=initializers)
        872     finally:
        873       # At this point we know that the initialization is complete (or less
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\eager\def_function.py in _initialize(self, args, kwds, add_initializers_to)
        724     self._concrete_stateful_fn = (
        725         self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
    --> 726             *args, **kwds))
        727 
        728     def invalid_creator_scope(*unused_args, **unused_kwds):
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\eager\function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
       2974       args, kwargs = None, None
       2975     with self._lock:
    -> 2976       graph_function, _ = self._maybe_define_function(args, kwargs)
       2977     return graph_function
       2978 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\eager\function.py in _maybe_define_function(self, args, kwargs)
       3369 
       3370           self._function_cache.missed.add(call_context_key)
    -> 3371           graph_function = self._create_graph_function(args, kwargs)
       3372           self._function_cache.primary[cache_key] = graph_function
       3373 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\eager\function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
       3214             arg_names=arg_names,
       3215             override_flat_arg_shapes=override_flat_arg_shapes,
    -> 3216             capture_by_value=self._capture_by_value),
       3217         self._function_attributes,
       3218         function_spec=self.function_spec,
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\framework\func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
        988         _, original_func = tf_decorator.unwrap(python_func)
        989 
    --> 990       func_outputs = python_func(*func_args, **func_kwargs)
        991 
        992       # invariant: `func_outputs` contains only Tensors, CompositeTensors,
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\eager\def_function.py in wrapped_fn(*args, **kwds)
        632             xla_context.Exit()
        633         else:
    --> 634           out = weak_wrapped_fn().__wrapped__(*args, **kwds)
        635         return out
        636 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\framework\func_graph.py in wrapper(*args, **kwargs)
        971                     recursive=True,
        972                     optional_features=autograph_options,
    --> 973                     user_requested=True,
        974                 ))
        975           except Exception as e:  # pylint:disable=broad-except
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in converted_call(f, args, kwargs, caller_fn_scope, options)
        452     if is_autograph_strict_conversion_mode():
        453       raise
    --> 454     return _fall_back_unconverted(f, args, kwargs, options, e)
        455 
        456   with StackTraceMapper(converted_f), tf_stack.CurrentModuleFilter():
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in _fall_back_unconverted(f, args, kwargs, options, exc)
        499     logging.warn(warning_template, f, file_bug_message, exc)
        500 
    --> 501   return _call_unconverted(f, args, kwargs, options)
        502 
        503 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in _call_unconverted(f, args, kwargs, options, update_cache)
        476 
        477   if kwargs is not None:
    --> 478     return f(*args, **kwargs)
        479   return f(*args)
        480 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\training.py in train_function(iterator)
        788       def train_function(iterator):
        789         """Runs a training execution with one step."""
    --> 790         return step_function(self, iterator)
        791 
        792     else:
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\training.py in step_function(model, iterator)
        778 
        779       data = next(iterator)
    --> 780       outputs = model.distribute_strategy.run(run_step, args=(data,))
        781       outputs = reduce_per_replica(
        782           outputs, self.distribute_strategy, reduction='first')
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\distribute\distribute_lib.py in run(***failed resolving arguments***)
       1266       fn = autograph.tf_convert(
       1267           fn, autograph_ctx.control_status_ctx(), convert_by_default=False)
    -> 1268       return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
       1269 
       1270   def reduce(self, reduce_op, value, axis):
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\distribute\distribute_lib.py in call_for_each_replica(self, fn, args, kwargs)
       2732       kwargs = {}
       2733     with self._container_strategy().scope():
    -> 2734       return self._call_for_each_replica(fn, args, kwargs)
       2735 
       2736   def _call_for_each_replica(self, fn, args, kwargs):
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\distribute\distribute_lib.py in _call_for_each_replica(self, fn, args, kwargs)
       3353   def _call_for_each_replica(self, fn, args, kwargs):
       3354     with ReplicaContext(self._container_strategy(), replica_id_in_sync_group=0):
    -> 3355       return fn(*args, **kwargs)
       3356 
       3357   def _reduce_to(self, reduce_op, value, destinations, experimental_hints):
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in wrapper(*args, **kwargs)
        665       try:
        666         with conversion_ctx:
    --> 667           return converted_call(f, args, kwargs, options=options)
        668       except Exception as e:  # pylint:disable=broad-except
        669         if hasattr(e, 'ag_error_metadata'):
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in converted_call(f, args, kwargs, caller_fn_scope, options)
        394 
        395   if not options.user_requested and conversion.is_allowlisted(f):
    --> 396     return _call_unconverted(f, args, kwargs, options)
        397 
        398   # internal_convert_user_code is for example turned off when issuing a dynamic
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\autograph\impl\api.py in _call_unconverted(f, args, kwargs, options, update_cache)
        476 
        477   if kwargs is not None:
    --> 478     return f(*args, **kwargs)
        479   return f(*args)
        480 
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\training.py in run_step(data)
        771 
        772       def run_step(data):
    --> 773         outputs = model.train_step(data)
        774         # Ensure counter is updated only if `train_step` succeeds.
        775         with ops.control_dependencies(_minimum_control_deps(outputs)):
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\training.py in train_step(self, data)
        737 
        738     with backprop.GradientTape() as tape:
    --> 739       y_pred = self(x, training=True)
        740       loss = self.compiled_loss(
        741           y, y_pred, sample_weight, regularization_losses=self.losses)
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __call__(self, *args, **kwargs)
       1001         with autocast_variable.enable_auto_cast_variables(
       1002             self._compute_dtype_object):
    -> 1003           outputs = call_fn(inputs, *args, **kwargs)
       1004 
       1005         if self._activity_regularizer:
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\functional.py in call(self, inputs, training, mask)
        423     """
        424     return self._run_internal_graph(
    --> 425         inputs, training=training, mask=mask)
        426 
        427   def compute_output_shape(self, input_shape):
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\functional.py in _run_internal_graph(self, inputs, training, mask)
        558 
        559         args, kwargs = node.map_arguments(tensor_dict)
    --> 560         outputs = node.layer(*args, **kwargs)
        561 
        562         # Update tensor_dict.
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\layers\recurrent.py in __call__(self, inputs, initial_state, constants, **kwargs)
        658 
        659     if initial_state is None and constants is None:
    --> 660       return super(RNN, self).__call__(inputs, **kwargs)
        661 
        662     # If any of `initial_state` or `constants` are specified and are Keras
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __call__(self, *args, **kwargs)
        987         inputs = self._maybe_cast_inputs(inputs, input_list)
        988 
    --> 989       input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
        990       if eager:
        991         call_fn = self.call
    
    G:\anaconda\envs\tensorflow_env\lib\site-packages\tensorflow\python\keras\engine\input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
        272                              ' is incompatible with layer ' + layer_name +
        273                              ': expected shape=' + str(spec.shape) +
    --> 274                              ', found shape=' + display_shape(x.shape))
        275 
        276 
    
    ValueError: Input 0 is incompatible with layer lstm: expected shape=(None, None, 5), found shape=(None, 5, 1)

 

I do not understand what the error is here.
Any help would be much appreciated.
Thanks.