Speaker
Mr
Sumit Soman
(PhD Candidate)
Description
Learning sparse representations and minimizing model complexity have gained much interest recently. Parsimonious models are expected to generalize well, are easier to implement, and lead to shorter test times. The recently proposed Minimal Complexity Machine (MCM) showed that for training data $X=\lbrace (x_i, y_i) \,|\, x_i \in \mathbb{R}^n, y_i \in \lbrace -1, +1 \rbrace, i=1,2,...,M\rbrace$, minimizing $h^2$, where
\begin{equation}
h = \frac{\max_{i = 1, 2, ..., M} |u^T x_i + v|}{\min_{i = 1, 2, ..., M} |u^T x_i + v|},
\end{equation}
leads to a hyperplane classifier $u^T x + v = 0$ with a small VC dimension. This task was shown to be equivalent to solving the linear program
\begin{equation}
\min_{w, b, q, h} \; h + C \cdot \sum_{i = 1}^M q_i
\end{equation}
subject to
\begin{equation}
h \geq y_i \cdot [w^T x_i + b] + q_i, ~i = 1, 2, ..., M
\end{equation}
\begin{equation}
y_i \cdot [w^T x_i + b] + q_i \geq 1, ~i = 1, 2, ..., M
\end{equation}
\begin{equation}
q_i \geq 0, ~i = 1, 2, ..., M.
\end{equation}
Models such as the Extreme Learning Machine (ELM) and the Random Vector Functional Link Network (RVFLN) have been adapted to a number of applications and offer several advantages, chiefly that the hidden-layer weights are chosen at random and only the output weights are learned, which makes training fast and non-iterative. Typically, the ELM solves
\begin{equation}
\min_{\beta, \xi} \; \frac{1}{2} \|\beta\|^2 + \frac{1}{2} C \sum_{i=1}^M \xi_i^2
\end{equation}
subject to
\begin{equation}
h(x_i) \beta = y_i - \xi_i , \; i=1, 2, ..., M.
\end{equation}
The last layer of the ELM network conventionally involves the computation of a pseudo-inverse: the output weights $\beta$ are obtained as the least-squares solution of $H \beta = Y$, i.e. $\beta = H^{\dagger} Y$, where $H$ is the $M \times \hat{n}$ hidden layer output matrix with entries $H_{ij} = g(w_j \cdot x_i + b_j)$ for activation function $g$, $\beta_i = [\beta_{i1}, \beta_{i2},...,\beta_{im}]^T$ is the weight vector connecting the $i^{th}$ hidden node to the $m$ output nodes, $w_i=[w_{i1}, w_{i2},...,w_{in}]^T$ is the weight vector connecting the input nodes to the $i^{th}$ hidden node, $b_i$ is the bias of the $i^{th}$ hidden node, and $Y$ is the vector of targets $y_i$.
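For concreteness, a minimal sketch of this pseudo-inverse step is shown below; the number of hidden nodes, the sigmoid activation, and the function names elm_fit and elm_predict are illustrative assumptions.
\begin{verbatim}
# Minimal sketch of a standard ELM: random hidden layer, pseudo-inverse output weights.
import numpy as np

def elm_fit(X, Y, n_hidden=100, seed=0):
    rng = np.random.default_rng(seed)
    M, n = X.shape
    W = rng.standard_normal((n, n_hidden))     # random input-to-hidden weights w_i
    b = rng.standard_normal(n_hidden)          # random hidden biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))     # hidden layer output matrix (M x n_hidden)
    beta = np.linalg.pinv(H) @ Y               # output weights: least-squares solution of H beta = Y
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
\end{verbatim}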
We propose combining the ELM with the MCM. This allows us to build classifiers or regressors with lower complexity in terms of the VC dimension, and induces sparsity in the connections to the final layer of the network. This has been shown not only to improve generalization, but also to yield sparser networks, which are arguably closer to models of human cognition. It also avoids the numerical stability issues associated with computing the pseudo-inverse.
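One plausible reading of this combination (an assumption for illustration only, not necessarily the exact formulation presented in the talk) is to keep the random ELM hidden layer but learn the output-layer weights with the MCM linear program instead of the pseudo-inverse, reusing the two sketches above:
\begin{verbatim}
# Assumed combination for illustration: ELM hidden layer + MCM output layer.
# X_train is M x n, y_train has labels in {-1, +1}; mcm_train is defined above.
rng = np.random.default_rng(0)
W = rng.standard_normal((X_train.shape[1], 50))          # random input weights
b_hid = rng.standard_normal(50)                          # random hidden biases
H_train = 1.0 / (1.0 + np.exp(-(X_train @ W + b_hid)))   # hidden-layer features
w_out, b_out = mcm_train(H_train, y_train, C=1.0)        # sparse, low-VC output layer
y_pred = np.sign(H_train @ w_out + b_out)                # predicted labels
\end{verbatim}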
Primary authors
Prof.
Jayadeva
(Department of Electrical Engineering, Indian Institute of Technology, Delhi, India)
Mr
Sumit Soman
(PhD Candidate)