Train the module and criterion given in the
constructor over dataset, using the
internal parameters.
StochasticGradient expects the dataset to be an object which implements the operator
dataset[index] and the method dataset:size(). The size() method
returns the number of examples, and dataset[i] has to return the i-th example.
An example has to be an object which implements the operator
example[field], where field might take the value 1 (input features)
or 2 (corresponding label which will be given to the criterion).
The input is usually a Tensor (unless you use a special kind of gradient module,
such as table layers). The label type depends on the criterion.
For example, the MSECriterion expects a Tensor, but the
ClassNLLCriterion expects an integer number (the class).
Such a dataset is easily constructed using Lua tables, but it could also be any C object,
for example, as long as the required operators/methods are implemented.
See an example.
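
As a minimal sketch of the dataset interface described above, the following builds a dataset from a plain Lua table and passes it to train(). The toy data (100 random 2-d points with an XOR-like label), the small network and the parameter values are assumptions chosen only for illustration; they are not prescribed by StochasticGradient.

```lua
require 'torch'
require 'nn'

-- Hypothetical toy dataset: a plain Lua table with a size() method.
local dataset = {}
function dataset:size() return 100 end   -- number of examples

for i = 1, dataset:size() do
  local input = torch.randn(2)           -- example[1]: input features
  local label = torch.Tensor(1)          -- example[2]: target, a Tensor for MSECriterion
  if input[1] * input[2] > 0 then        -- XOR-like labelling (illustrative only)
    label[1] = -1
  else
    label[1] = 1
  end
  dataset[i] = {input, label}            -- dataset[i] returns the i-th example
end

-- Illustrative module and criterion; any compatible pair could be used instead.
local mlp = nn.Sequential()
mlp:add(nn.Linear(2, 4))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(4, 1))
local criterion = nn.MSECriterion()

local trainer = nn.StochasticGradient(mlp, criterion)
trainer.learningRate = 0.01              -- assumed value
trainer.maxIteration = 5                 -- assumed value, kept small for the sketch
trainer:train(dataset)
```

With a ClassNLLCriterion, the second field of each example would hold the class index as a number rather than a Tensor.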