## Greedy algorithm python – Coin change problem

A greedy python algorithm (greedy algorithm python) greedily selects the best choice at every step. He hopes that these choices lead to the optimal overall solution to the problem. So, a greedy algorithm does not always give the best solution. However in many problems this is the case.

## Greedy Algorithm: Introduction

The problem of giving change is formulated as follows. How to return a given sum with a minimum of coins and banknotes?

Here is an example in python of the resolution of the problem:

If we consider the Euro monetary system without the cents we have the whole

EURO = (1, 2, 5, 10, 20, 50, 100, 200, 500)

## Greedy algorithm python : Coin change problem

Now, to give change to an x ​​value of using these coins and banknotes, then we will check the first element in the array. And if it’s greater than x, we move on to the next element. Otherwise let’s keep it. Now, after taking a valuable coin or bill from the array of coinAndBill [i], the total value x we ​​need to do will become x – coinAndBill [i].

Here is the associated greedy python algorithm:

def renduMonnaieGlouton(x):
pieceEtBillets = [500,200,100,50,20,10,5,2,1]
i = 0

while(x>0):
if(pieceEtBillets[i] > x):
i = i+1
else:
print(str(pieceEtBillets[i]))
x -= pieceEtBillets[i];

renduMonnaieGlouton(33)#Exemple sur 33 euros

The output for 33 euro is then:

20
10
2
1

Another example with 55 euro of greedy python algorithm:

def renduMonnaieGlouton(x):
pieceEtBillets = [500,200,100,50,20,10,5,2,1]
i = 0

while(x>0):
if(pieceEtBillets[i] > x):
i = i+1
else:
print(str(pieceEtBillets[i]))
x -= pieceEtBillets[i];

renduMonnaieGlouton(55)#Exemple sur 55 euros

Output :

50
5

## Conclusion

The problem of giving change is NP-difficult relative to the number of coins and notes of the monetary system considered (euro in this example). To go further, we can demonstrate that for certain so-called canonical money systems, the use of a greedy algorithm is optimal. A monetary system is said to be canonical if for any sum s the greedy algorithm leads to a minimal decomposition.

NP difficulty is the defining property of a class of problems informally “at least as difficult as the most difficult NP problems”.

A simple example of an NP-hard problem is the sum of subset problem. If P is different from NP then it is unlikely to find a polynomial time algorithm that exactly solves this problem.

http://math.univ-lyon1.fr/irem/IMG/pdf/monnaie.pdf

http://www.dil.univ-mrs.fr/~gcolas/algo-licence/slides/gloutons.pdf

## Internal links python greedy algorithm :

http://128mots.com/index.php/category/python/

http://128mots.com/index.php/category/non-classe/

## Levenshtein Python Algo implementation Wagner & Fischer

Here is the levenshtein python implementation of the Wagner & Fischer algorithm (Wagner-Fischer). It allows to calculate the distance of Levenshtein (distance between two strings of characters).

First, the goal of the algorithm is to find the minimum cost. Minimal cost to transform one channel into another. Then a recursive function allows to return the minimum number of transformation. And to transform a substring of A with n characters into a substring of B with the same number of characters. The solution is given. We must calculate the distance between the 2 letters at the same position in the chain A and B. Then Either the letters and the previous letter are identical. Either there is a difference and in this case we calculate 3 costs. So the first is to delete a letter from the chain A. And insert a letter in the chain A to substitute a letter from the chain A. Then we can then find the minimum cost.

Example of levenshtein python implementation :

import numpy as np

def levenshtein(chaine1, chaine2):
taille_chaine1 = len(chaine1) + 1
taille_chaine2 = len(chaine2) + 1
levenshtein_matrix = np.zeros ((taille_chaine1, taille_chaine2))
for x in range(taille_chaine1):
levenshtein_matrix [x, 0] = x
for y in range(taille_chaine2):
levenshtein_matrix [0, y] = y

for x in range(1, taille_chaine1):
for y in range(1, taille_chaine2):
if chaine1[x-1] == chaine2[y-1]:
levenshtein_matrix [x,y] = min(
levenshtein_matrix[x-1, y] + 1,
levenshtein_matrix[x-1, y-1],
levenshtein_matrix[x, y-1] + 1
)
else:
levenshtein_matrix [x,y] = min(
levenshtein_matrix[x-1,y] + 1,
levenshtein_matrix[x-1,y-1] + 1,
levenshtein_matrix[x,y-1] + 1
)
return (levenshtein_matrix[taille_chaine1 - 1, taille_chaine2 - 1])

print("distance de levenshtein = " + str(levenshtein("Lorem ipsum dolor sit amet", "Laram zpsam dilir siy amot")))

To calculate the Levenshtein distance with a non-recursive algorithm. We use a matrix which contains the Levenshtein distances. Then These are the distances between all the prefixes of the first string and all the prefixes of the second string.

Also we can dynamically calculate the values ​​in this matrix. The last calculated value will be the Levenshtein distance between the two whole chains.

Finally there are many use cases of the Levenshtein distance. Levenshtein distance is used in domains. Computational linguistics, molecular biology. And again bioinformatics, machine learning. Also deep learning, DNA analysis. In addition, a program that does spell checking uses, for example, the Levenshtein distance.

## Links on the Wagner & Fischer algorithm (Wagner-Fischer) and the Levenshtein python distance or other language:

https://graal.hypotheses.org/tag/algorithme-de-wagner-fischer

https://fr.wikipedia.org/wiki/Algorithme_de_Wagner-Fischer

https://fr.wikipedia.org/wiki/Distance_de_Levenshtein

https://medium.com/@sddkal/wagner-fischer-algorithm-be0d96893f6d

https://www.python-course.eu/levenshtein_distance.php

http://128mots.com/index.php/category/python/

http://128mots.com/index.php/category/graphes/

## Quantum sort : sorting algorithm IBM Qiskit tutorial python

In this article, I propose a quantum sort algorithm with Qiskit tutorial in python that allows you to sort a list of integers coded on two qubits. I have found that there is little literature on the subject. This algorithm can be used and can be used as a basis for solving sorting problems on an unsorted list of integers.

## Introduction

We consider the following problem: Given a list of four integers between 0 and 3 that are strictly different sort the given list. The solution proposed is a quantum sort algorithm.

A classic sorting algorithm such as merge sorting solves the problem, see in more detail my article on the subject:

Merge sorting follows the divide and conquer paradigm of dividing the initial task into two similar smaller tasks.

## Some concepts of quantum algorithm and IBM Qiskit tutorial

I recall here some concept that we are going to use in our quantum sorting algorithm, so I advise you to consult the IBM Qiskit site to learn more https://qiskit.org/textbook/what-is-quantum.html and also to read the very good book “Programming Quantum Computers: Essential Algorithms and Code Samples” by Eric R. Johnston, Nic Harrigan, Mercedes Gimeno-Segovia:

###### Circuits

The principle in quantum algorithm is to manipulate qubits for inputs which we have in outputs which we need, the process can be represented in the form of circuit (inputs on the left, output on the right). We can perform different operations by means of quantum gate (which act on the qubits in a way similar in classical algorithm to logic gates like AND, OR, Exclusive OR…).

In this article I will rely on Qiskit which is the IBM framework for quantum computing.

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi

qreg_qubit = QuantumRegister(2, 'qubit')
creg_classic = ClassicalRegister(2, 'classic')
circuit = QuantumCircuit(qreg_qubit, creg_classic)

circuit.reset(qreg_qubit[0])
circuit.reset(qreg_qubit[1])
circuit.x(qreg_qubit[0])
circuit.measure(qreg_qubit[0], creg_classic[0])
circuit.measure(qreg_qubit[1], creg_classic[1])

In this first algorithm, a quantum register is initialized to the state | 01> measures in a classical register the quantum state of a register of 2 qubits. The circuit can be drawn using the circuit.draw () instruction or via Circuit Composer from IBM Quantum Experience: https://quantum-computing.ibm.com/

###### Quantum state of a single qubit:

The state of a qubit can be expressed as a state vector:

With alpha and beta which are complex numbers (ie a real and imaginary part) in this case there is the relation (mathematical reminder on complex numbers: https://www.mathsisfun.com/numbers/complex-numbers.html):

If we go into trigonometric and exponential form we have with θ and ϕ which are real numbers.

So if we measure a qubit q we have:

That is, there is a probability of measuring the qubit in state | 0> and a probability of measuring the qubit in state | 1>. The measurement of a qubit reveals whether it is in state 1 or in state 0, in general the measurement is placed at the end of the circuit because it affects the state of the qubit accordingly.

###### Quantum gates

Here are some quantum gates to know that we will use in our quantum sorting algorithm:

Gate X (Pauli-X) :

Gate X is used to return the state of the qubit without modifying the phase:

<

See the example above with the use of qiskit.

Gate Y (Pauli-Y) :

The pauli Y gate allows you to return the phase and the qubit at the same time:

Example if we consider here the initialized state | 0> of the qubit and its phase of 0, there is then a 100% chance of measuring the state | 0>.

Applying the quantum gate Y has the following effect on the state and phase of the qubit:

There is then a 100% chance of measuring state 1 and the phase is reversed (see the bloch sphere above)

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi

qreg_qubit = QuantumRegister(1, 'qubit')
creg_classic = ClassicalRegister(1, 'classic')
circuit = QuantumCircuit(qreg_qubit, creg_classic)

circuit.reset(qreg_qubit[0])
circuit.y(qreg_qubit[0])
circuit.measure(qreg_qubit[0], creg_classic[0])

Gate Z (Pauli-Z) :

The Z gate is used to return only the phase and not the state of the qubit.

Note that the Y and Z gates can be represented by the following matrices:

The Hadamard gate (H gate) is an important quantum gate which allows to create a superposition of | 0⟩ and | 1⟩.

The matrix of this gate is:

SWAP Gate

A SWAP gate allows the state of two qubits to be exchanged:

SWAP state example | 1⟩ from qubit 0 to qbit 2

And state SWAP | 0⟩ from qubit 1 to qbit 2

In Python with the Qiskit framework we have:

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi

qreg_qubit = QuantumRegister(3, 'qubit')
creg_classic = ClassicalRegister(1, 'classic')
circuit = QuantumCircuit(qreg_qubit, creg_classic)

circuit.reset(qreg_qubit[0])
circuit.reset(qreg_qubit[1])
circuit.reset(qreg_qubit[2])
circuit.x(qreg_qubit[0])
circuit.swap(qreg_qubit[1], qreg_qubit[2])
circuit.measure(qreg_qubit[2], creg_classic[0])

Toffoli Gate

The Toffoli quantum gate is three qubits (two commands and one target).
It performs a phase reversal (quantum gate X on the target) only if both controls are in state | 1⟩.
A Toffoli is equivalent to an UNControlled gate (it is also called the CCX gate).

Example below with 2 command qubits in state | 1⟩ there is then a 100% chance of measuring the state | 1⟩ on the target qubit of the tofoli gate which was initially initialized to state | 0⟩ (or a reversal of the state).

Example below in the opposite case (there is then 0% chance of measuring state | 1⟩ or 100% chance of measuring state | 0⟩):

###### Classical gate AND gate, OR gate equivalence in quantum gate – IBM Qiskit tutorial

We see here how to combine doors to create a behavior equivalent to what we are used to using in classical circuits.

Gate AND :

From Qbit a and b and a qubit with a forced initialization to state | 0⟩ we can obtain the operation a AND b using a Toffoli quantum gate and a SWAP mate.

It should be noted that at the end of this circuit we find the state A AND B on qubit B the state of qubit B is ultimately on qubit C.

The equivalent python code is as follows:

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi

qreg_a = QuantumRegister(1, 'a')
qreg_b = QuantumRegister(1, 'b')
qreg_c = QuantumRegister(1, 'c')
creg_aANDb = ClassicalRegister(1, 'aANDb')
creg_aState = ClassicalRegister(1, 'aState')
creg_bState = ClassicalRegister(1, 'bState')
circuit = QuantumCircuit(qreg_a, qreg_b, qreg_c, creg_aANDb, creg_aState, creg_bState)

circuit.reset(qreg_c[0])
circuit.ccx(qreg_a[0], qreg_b[0], qreg_c[0])
circuit.swap(qreg_b[0], qreg_c[0])
circuit.measure(qreg_b[0], creg_aANDb[0])
circuit.measure(qreg_a[0], creg_aState[0])
circuit.measure(qreg_c[0], creg_bState[0])

The equivalent of this code in OpenQASM assembler is:

OPENQASM 2.0;
include "qelib1.inc";

qreg a[1];
qreg b[1];
qreg c[1];
creg aANDb[1];
creg aState[1];
creg bState[1];

reset c[0];
ccx a[0],b[0],c[0];
swap b[0],c[0];
measure b[0] -> aANDb[0];
measure a[0] -> aState[0];
measure c[0] -> bState[0];

We will use this gate (in Quantum sort algorithm for example) to perform our quantum list sorting, here are some tests:

For | 0⟩ AND | 0⟩ you have | 0⟩ on qubit B:

For | 0⟩ AND | 1⟩ you have | 0⟩ on qubit B:

For | 1⟩ AND | 0⟩ you have | 0⟩ on qubit B:

Finally for | 1⟩ AND | 1⟩ you have good | 1⟩ on qubit B with a probability of 100%:

Gate OR :

We will use it in Quantum sort algorithm.

To realize an OR gate we combine 4 X quantum gates, a Toffoli gate and a SWAP quantum gate, also the state of C must be initialized to the state | 1>:

The code for this door in Qiskit:

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi

qreg_a = QuantumRegister(1, 'a')
qreg_b = QuantumRegister(1, 'b')
qreg_c = QuantumRegister(1, 'c')
creg_aORb = ClassicalRegister(1, 'aORb')
creg_aState = ClassicalRegister(1, 'aState')
creg_bState = ClassicalRegister(1, 'bState')
circuit = QuantumCircuit(qreg_a, qreg_b, qreg_c, creg_aORb, creg_aState, creg_bState)

circuit.reset(qreg_a[0])
circuit.reset(qreg_b[0])
circuit.x(qreg_b[0])
circuit.x(qreg_c[0])
circuit.x(qreg_a[0])
circuit.ccx(qreg_a[0], qreg_b[0], qreg_c[0])
circuit.x(qreg_a[0])
circuit.x(qreg_b[0])
circuit.swap(qreg_b[0], qreg_c[0])
circuit.measure(qreg_b[0], creg_aORb[0])
circuit.measure(qreg_a[0], creg_aState[0])
circuit.measure(qreg_c[0], creg_bState[0])

in OpenQASM :

OPENQASM 2.0;
include "qelib1.inc";

qreg a[1];
qreg b[1];
qreg c[1];
creg aORb[1];
creg aState[1];
creg bState[1];

reset a[0];
reset b[0];
x b[0];
x c[0];
x a[0];
ccx a[0],b[0],c[0];
x a[0];
x b[0];
swap b[0],c[0];
measure b[0] -> aORb[0];
measure a[0] -> aState[0];
measure c[0] -> bState[0];

If we test we get a OR gate :

Here for | 0⟩ OR | 0⟩ you have good | 0⟩ on qubit B with a probability of 100%:

Here for | 0⟩ OR | 1⟩ you have good | 1⟩ on qubit B with a probability of 100%:

For | 1⟩ OR | 0⟩ you have good | 1⟩ on qubit B with a probability of 100%:

Finally for | 1⟩ OR | 1⟩ you have good | 1⟩ on qubit B with a probability of 100%:

## Quantum sort : Quantum sorting algorithm

To solve this problem we will first create a quantum circuit (used in Quantum sort algorithm) that allows you to measure if a figure is lower than another for this the intuition is to be based on a circuit similar to a classic circuit of comparison of magnitude at 2 qubit .

###### Quantum magnitude comparator for Quantum sort algorithm:

We consider a digit A stored in a 2 qubit register (A0, A1) and a digit B stored in another register (B0, B1):

To know if A> B with 2 registers of 2 Qubits, the equation must be applied:

The equivalent circuit to create this is:

Here in the example the 2 bits compared by this circuit are in register a and register B

The equivalent code for this comparator in OpenQASM is:

I have created a GATE comparator4 which has the following OPENQASM equivalent code:

gate comparator4 a, b, c, d, f, g, i, j, k, l {
x d;
x j;
x l;
andG1 b, d, f;
andG2 a, b, g;
andG3 a, f, i;
orG1 b, f, j;
x c;
andG4 c, f, k;
x c;
orG2 d, f, l;
swap d, i;
x d;
swap b, g;
}

So we see that this custom quantum gate is made up of a quantum AND gate and a custom quantum OR gate that I created in the following way. If we unfold the doors andG1, andG2, andG3, orG1, orG2 we have:

The OpenQASM equivalent code to write the different gate andG1, andG2, andG3, orG1, orG2 to create comparator 4 is:

gate andG1 a, b, c {
ccx a, b, c;
swap b, c;
}

gate andG2 a, b, c {
ccx a, b, c;
swap b, c;
}

gate andG3 a, b, c {
ccx a, b, c;
swap b, c;
}

gate orG1 a, b, c {
x a;
x b;
ccx a, b, c;
x a;
x b;
swap b, c;
}

gate andG4 a, b, c {
ccx a, b, c;
swap b, c;
}

gate orG2 a, b, c {
x a;
x b;
ccx a, b, c;
x a;
x b;
swap b, c;
}
###### Circuit for checking a sorted list

If we consider here a list of 4 digits stored in the 4 registers of 2 qubits: a, b, m, n. The circuit above is used to test whether the input list is sorted or not. For this I cascaded 3 quantum magnitude comparators (seen in detail in the previous paragraph). Thus the circuit obtained will test if a> b and put the result on the qbit c then check if b> m and store the result on the qbit o and finally it tests if m> n and store the result on the qubit q.

The last quantum gate andOnCompare makes it possible to apply a mutliple AND part: if c AND o AND q are in the quantum state | 1> then the qubit q is set in the quantum state | 1> otherwise it is in the quantum | 0>:

Below is the code of the door andOnCompare in openQASM:

gate andOnCompare a, b, c, d, f {
ccx a, b, d;
swap b, d;
ccx b, c, f;
swap c, f;
}

We have thus created a verification circuit that a list of 4 digits stored in registers a, b, m, n is sorted in a decreasing way or not. The result is stored in the qubit q which has a 100% chance of being measured at state | 1> if the list is sorted otherwise 100% to be at | 0> if the list a, b, m, n is not in decreasing order.

In the general diagram above we notice that in the example:

a=0,b=2,m=3,n=1

in this case this circuit will return the quantum state | 0> on the qubit q

###### Creation of a circuit of permutations

The idea of ​​our algorithm is to permute these 4 digits using control qubits which will be in superimposed state.

The permutation circuit is as follows:

A register of 6 Qubit “control” allows to control the permutation of the list knowing that there are for 4 elements: 4 * 3 * 2 * 1 possible permutations that is to say 24 possibilities the 6 qubits allow to generate these different states.

4 Hadamard quantum gates are used to implement these permutations and take advantage of the superposition of the Qubit ‘control’ register

In other words in the circuit above we see that we start the circuit by initializing the registers a, b, m, n with its list of unsorted numbers, here (a = 0, b = 2, m = 3, n = 1).

The permutation circuit which is controlled by a superimposed qubit register allows the list to be permuted in quantum superposition.

Thus we obtain on the qubit q0 the quantum state | 1> if the permuted list is sorted in a decreasing way otherwise the quantum state | 0>.

The OpenQASM code of the quantum algorithm for performing permutations is as follows:

gate permuter a, b, c, d, f, g, i, j, k, l, m, n, o, p {
cswap k, a, c;
cswap l, a, f;
cswap m, a, i;
cswap k, b, d;
cswap l, b, g;
cswap m, b, j;
cswap n, c, f;
cswap o, c, i;
cswap n, d, g;
cswap o, d, j;
cswap p, f, i;
cswap p, g, j;
}
###### Amplification of the result

I’m going to call this part of my algorithm amplification: the idea is to put the qubit a, b, m, n at | 00>, | 00>, | 00>, | 00> if the list is unsorted. If the list is sorted then the quantum state of a, b, m, n should not be changed.

So we should measure the following result: | 00>, | 00>, | 00>, | 00> when the list is unsorted and a, b, m, n with the input digits in sorted order (in the case in our example this means that there is a probability of measuring the state | 11>, | 10>, | 01>, | 00> either 3,2,1,0 or the sorted list).

The circuit of the quantum amplifier is as follows:

The circuit is based on a series of quantum gate AND applied to the output state q0 AND the number of the list. When the qubit q0 indicates that the list is sorted it is at | 1> and leaves the state of the qubits a, b, m, n at their input state (or permuted by superposition). When the qubit q0 indicates that the list is unsorted it is in the state | 0> the application of the AND gate will thus set the qubits a, b, m, n to | 0>.

The amplifier code is as follows according to a quantum algorithm encoded in OpenQASM:

gate amplifier2 a, b, c, d, f, g, i, j, k, l, m, n, o, p, q, r, v {
ccx a, k, l;
swap a, l;
ccx b, k, m;
swap b, m;
ccx c, k, n;
swap c, n;
ccx d, k, o;
swap d, o;
ccx f, k, p;
swap f, p;
ccx g, k, q;
swap g, q;
ccx i, k, r;
swap i, r;
ccx j, k, v;
swap j, v;
}
###### Complete quantum algorithm for descending sorting of a list of four digits:

Here is the circuit used to sort the following list: a = 0, b = 2, m = 3, n = 1.

It is easily possible to initialize the circuit with another list by modifying the initialization of registers a, b, m, n. Also the current algorithm allows to sort numbers coded on 2 qubits it remains possible to modify the algorithm according to the same principle to sort numbers stored on 3 qubits and consider a list of more than 4 elements (it is then necessary to multiply the number of qubit registers to store the entry list).

The complete code is as follows in openQASM:

OPENQASM 2.0;
include "qelib1.inc";
gate andG1 a, b, c {
ccx a, b, c;
swap b, c;
}

gate andG2 a, b, c {
ccx a, b, c;
swap b, c;
}

gate andG3 a, b, c {
ccx a, b, c;
swap b, c;
}

gate orG1 a, b, c {
x a;
x b;
ccx a, b, c;
x a;
x b;
swap b, c;
}

gate andG4 a, b, c {
ccx a, b, c;
swap b, c;
}

gate orG2 a, b, c {
x a;
x b;
ccx a, b, c;
x a;
x b;
swap b, c;
}

gate comparator a, b, c, d, f, g, i, j, k, l {
x d;
x j;
x l;
andG1 b, d, f;
andG2 a, b, g;
andG3 a, f, i;
orG1 b, f, j;
x c;
andG4 c, f, k;
orG2 d, f, l;
}

gate comparator2 a, b, c, d, f, g, i, j, k, l {
x d;
x j;
x l;
andG1 b, d, f;
andG2 a, b, g;
andG3 a, f, i;
orG1 b, f, j;
x c;
andG4 c, f, k;
orG2 d, f, l;
x d;
swap b, g;
}

gate comparator3 a, b, c, d, f, g, i, j, k, l {
x d;
x j;
x l;
andG1 b, d, f;
andG2 a, b, g;
andG3 a, f, i;
orG1 b, f, j;
x c;
andG4 c, f, k;
orG2 d, f, l;
swap d, i;
x d;
swap b, g;
}

gate comparator4 a, b, c, d, f, g, i, j, k, l {
x d;
x j;
x l;
andG1 b, d, f;
andG2 a, b, g;
andG3 a, f, i;
orG1 b, f, j;
x c;
andG4 c, f, k;
x c;
orG2 d, f, l;
swap d, i;
x d;
swap b, g;
}

gate permutation a, b, c, d, f, g, i, j, k, l, m, n, o, p {
ccx a, c, k;
ccx a, f, l;
ccx a, i, m;
ccx b, d, k;
ccx b, g, l;
ccx b, j, m;
ccx c, f, n;
ccx c, i, o;
ccx d, g, n;
ccx d, j, o;
ccx f, i, p;
ccx g, j, p;
}

gate andOnCompare a, b, c, d, f {
ccx a, b, d;
swap b, d;
ccx b, c, f;
swap c, f;
}

gate permuter a, b, c, d, f, g, i, j, k, l, m, n, o, p {
cswap k, a, c;
cswap l, a, f;
cswap m, a, i;
cswap k, b, d;
cswap l, b, g;
cswap m, b, j;
cswap n, c, f;
cswap o, c, i;
cswap n, d, g;
cswap o, d, j;
cswap p, f, i;
cswap p, g, j;
}

gate amplifier a, b, c, d, f, g, i, j, k, l, m, n, o, p, q, r, v {
ccx l, a, m;
swap a, m;
ccx l, b, n;
swap b, n;
ccx l, c, o;
swap c, o;
ccx l, d, p;
swap d, p;
ccx l, f, q;
swap f, q;
ccx l, f, q;
swap f, q;
ccx l, g, r;
swap g, r;
ccx l, i, v;
swap i, v;
ccx l, j, k;
swap j, k;
}

gate amplifier1 a, b, c, d, f, g, i, j, k, l, m, n, o, p, q, r, v {
ccx a, l, m;
swap a, m;
ccx b, l, n;
swap b, n;
ccx c, l, o;
swap c, o;
ccx d, l, p;
swap d, p;
ccx f, l, q;
swap f, q;
ccx f, l, q;
swap f, q;
ccx g, l, r;
swap g, r;
ccx i, l, v;
swap i, v;
ccx j, k, l;
swap j, k;
}

gate amplifier2 a, b, c, d, f, g, i, j, k, l, m, n, o, p, q, r, v {
ccx a, k, l;
swap a, l;
ccx b, k, m;
swap b, m;
ccx c, k, n;
swap c, n;
ccx d, k, o;
swap d, o;
ccx f, k, p;
swap f, p;
ccx g, k, q;
swap g, q;
ccx i, k, r;
swap i, r;
ccx j, k, v;
swap j, v;
}

qreg a[2];
qreg b[2];
qreg m[2];
qreg n[2];
qreg c[1];
qreg o[1];
qreg q[1];
qreg d[1];
qreg g[1];
qreg j[1];
qreg k[1];
qreg l[1];
qreg r[1];
qreg uu[1];
qreg zz[1];
qreg control[6];
creg sorted[8];

reset a[0];
reset a[1];
reset b[0];
reset b[1];
reset m[0];
reset m[1];
reset n[0];
reset n[1];
reset c[0];
reset o[0];
reset q[0];
reset d[0];
reset g[0];
reset j[0];
reset k[0];
reset l[0];
reset r[0];
reset uu[0];
h control[0];
h control[1];
h control[2];
h control[3];
h control[4];
h control[5];
reset a[0];
reset a[1];
reset b[0];
x b[1];
x m[0];
x m[1];
reset n[1];
reset c[0];
reset d[0];
reset g[0];
reset j[0];
reset k[0];
reset l[0];
x n[0];
permuter a[0],a[1],b[0],b[1],m[0],m[1],n[0],n[1],control[0],control[1],control[2],control[3],control[4],control[5];
comparator4 a[0],a[1],b[0],b[1],c[0],d[0],g[0],j[0],k[0],l[0];
reset zz[0];
reset d[0];
reset g[0];
reset j[0];
reset k[0];
reset l[0];
comparator4 b[0],b[1],m[0],m[1],o[0],d[0],g[0],j[0],k[0],l[0];
reset d[0];
reset g[0];
reset j[0];
reset k[0];
reset l[0];
comparator4 m[0],m[1],n[0],n[1],q[0],d[0],g[0],j[0],k[0],l[0];
andOnCompare c[0],o[0],q[0],r[0],uu[0];
reset d[0];
reset g[0];
reset j[0];
reset k[0];
reset l[0];
reset r[0];
reset uu[0];
amplifier2 a[0],a[1],b[0],b[1],m[0],m[1],n[0],n[1],q[0],d[0],g[0],j[0],k[0],l[0],r[0],uu[0],zz[0];
measure a[0] -> sorted[0];
measure a[1] -> sorted[1];
measure b[0] -> sorted[2];
measure b[1] -> sorted[3];
measure m[0] -> sorted[4];
measure m[1] -> sorted[5];
measure n[0] -> sorted[6];
measure n[1] -> sorted[7];

Here is the Python Qiskit code for my quantum sorting algorithm:

from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit
from numpy import pi

qreg_a = QuantumRegister(2, 'a')
qreg_b = QuantumRegister(2, 'b')
qreg_m = QuantumRegister(2, 'm')
qreg_n = QuantumRegister(2, 'n')
qreg_c = QuantumRegister(1, 'c')
qreg_o = QuantumRegister(1, 'o')
qreg_q = QuantumRegister(1, 'q')
qreg_d = QuantumRegister(1, 'd')
qreg_g = QuantumRegister(1, 'g')
qreg_j = QuantumRegister(1, 'j')
qreg_k = QuantumRegister(1, 'k')
qreg_l = QuantumRegister(1, 'l')
qreg_r = QuantumRegister(1, 'r')
qreg_uu = QuantumRegister(1, 'uu')
qreg_zz = QuantumRegister(1, 'zz')
qreg_control = QuantumRegister(6, 'control')
creg_sorted = ClassicalRegister(8, 'sorted')
circuit = QuantumCircuit(qreg_a, qreg_b, qreg_m, qreg_n, qreg_c, qreg_o, qreg_q, qreg_d, qreg_g, qreg_j, qreg_k, qreg_l, qreg_r, qreg_uu, qreg_zz, qreg_control, creg_sorted)

circuit.x(qreg_b[1])
circuit.x(qreg_m[0])
circuit.x(qreg_m[1])
circuit.x(qreg_n[0])
circuit.x(qreg_j[0])
circuit.x(qreg_l[0])
circuit.h(qreg_control[0])
circuit.cswap(qreg_control[0], qreg_a[0], qreg_b[0])
circuit.cswap(qreg_control[0], qreg_a[1], qreg_b[1])
circuit.h(qreg_control[1])
circuit.cswap(qreg_control[1], qreg_a[0], qreg_m[0])
circuit.cswap(qreg_control[1], qreg_a[1], qreg_m[1])
circuit.h(qreg_control[2])
circuit.cswap(qreg_control[2], qreg_a[0], qreg_n[0])
circuit.cswap(qreg_control[2], qreg_a[1], qreg_n[1])
circuit.h(qreg_control[3])
circuit.cswap(qreg_control[3], qreg_b[0], qreg_m[0])
circuit.cswap(qreg_control[3], qreg_b[1], qreg_m[1])
circuit.h(qreg_control[4])
circuit.cswap(qreg_control[4], qreg_b[0], qreg_n[0])
circuit.x(qreg_b[0])
circuit.cswap(qreg_control[4], qreg_b[1], qreg_n[1])
circuit.x(qreg_b[1])
circuit.ccx(qreg_a[1], qreg_b[1], qreg_c[0])
circuit.ccx(qreg_a[0], qreg_a[1], qreg_d[0])
circuit.swap(qreg_a[1], qreg_d[0])
circuit.x(qreg_a[1])
circuit.swap(qreg_b[1], qreg_c[0])
circuit.ccx(qreg_a[0], qreg_c[0], qreg_g[0])
circuit.swap(qreg_c[0], qreg_g[0])
circuit.x(qreg_c[0])
circuit.ccx(qreg_a[1], qreg_c[0], qreg_j[0])
circuit.x(qreg_c[0])
circuit.swap(qreg_c[0], qreg_j[0])
circuit.reset(qreg_j[0])
circuit.x(qreg_j[0])
circuit.x(qreg_a[1])
circuit.swap(qreg_a[1], qreg_d[0])
circuit.reset(qreg_d[0])
circuit.ccx(qreg_b[0], qreg_c[0], qreg_k[0])
circuit.swap(qreg_c[0], qreg_k[0])
circuit.x(qreg_c[0])
circuit.reset(qreg_k[0])
circuit.x(qreg_b[0])
circuit.x(qreg_b[1])
circuit.ccx(qreg_b[1], qreg_c[0], qreg_l[0])
circuit.x(qreg_c[0])
circuit.swap(qreg_c[0], qreg_l[0])
circuit.reset(qreg_l[0])
circuit.x(qreg_l[0])
circuit.x(qreg_b[1])
circuit.swap(qreg_b[1], qreg_g[0])
circuit.reset(qreg_g[0])
circuit.x(qreg_b[1])
circuit.h(qreg_control[5])
circuit.cswap(qreg_control[5], qreg_m[0], qreg_n[0])
circuit.x(qreg_m[0])
circuit.x(qreg_n[0])
circuit.cswap(qreg_control[5], qreg_m[1], qreg_n[1])
circuit.x(qreg_m[1])
circuit.ccx(qreg_b[1], qreg_m[1], qreg_o[0])
circuit.ccx(qreg_b[0], qreg_b[1], qreg_d[0])
circuit.swap(qreg_b[1], qreg_d[0])
circuit.x(qreg_b[1])
circuit.swap(qreg_m[1], qreg_o[0])
circuit.ccx(qreg_b[0], qreg_o[0], qreg_g[0])
circuit.swap(qreg_o[0], qreg_g[0])
circuit.x(qreg_o[0])
circuit.ccx(qreg_b[1], qreg_o[0], qreg_j[0])
circuit.x(qreg_o[0])
circuit.swap(qreg_o[0], qreg_j[0])
circuit.reset(qreg_j[0])
circuit.x(qreg_j[0])
circuit.x(qreg_b[1])
circuit.swap(qreg_b[1], qreg_d[0])
circuit.reset(qreg_d[0])
circuit.ccx(qreg_m[0], qreg_o[0], qreg_k[0])
circuit.swap(qreg_o[0], qreg_k[0])
circuit.reset(qreg_k[0])
circuit.x(qreg_o[0])
circuit.x(qreg_m[0])
circuit.x(qreg_m[1])
circuit.ccx(qreg_m[1], qreg_o[0], qreg_l[0])
circuit.x(qreg_o[0])
circuit.swap(qreg_o[0], qreg_l[0])
circuit.ccx(qreg_c[0], qreg_o[0], qreg_r[0])
circuit.reset(qreg_l[0])
circuit.x(qreg_l[0])
circuit.swap(qreg_o[0], qreg_r[0])
circuit.reset(qreg_r[0])
circuit.x(qreg_m[1])
circuit.swap(qreg_m[1], qreg_g[0])
circuit.reset(qreg_g[0])
circuit.x(qreg_m[1])
circuit.x(qreg_n[1])
circuit.ccx(qreg_m[1], qreg_n[1], qreg_q[0])
circuit.ccx(qreg_m[0], qreg_m[1], qreg_d[0])
circuit.swap(qreg_m[1], qreg_d[0])
circuit.x(qreg_m[1])
circuit.swap(qreg_n[1], qreg_q[0])
circuit.ccx(qreg_m[0], qreg_q[0], qreg_g[0])
circuit.swap(qreg_q[0], qreg_g[0])
circuit.x(qreg_q[0])
circuit.ccx(qreg_m[1], qreg_q[0], qreg_j[0])
circuit.x(qreg_q[0])
circuit.swap(qreg_q[0], qreg_j[0])
circuit.reset(qreg_j[0])
circuit.x(qreg_m[1])
circuit.swap(qreg_m[1], qreg_d[0])
circuit.reset(qreg_d[0])
circuit.ccx(qreg_n[0], qreg_q[0], qreg_k[0])
circuit.swap(qreg_q[0], qreg_k[0])
circuit.reset(qreg_k[0])
circuit.x(qreg_q[0])
circuit.x(qreg_n[0])
circuit.x(qreg_n[1])
circuit.ccx(qreg_n[1], qreg_q[0], qreg_l[0])
circuit.x(qreg_q[0])
circuit.swap(qreg_q[0], qreg_l[0])
circuit.reset(qreg_l[0])
circuit.ccx(qreg_o[0], qreg_q[0], qreg_uu[0])
circuit.swap(qreg_q[0], qreg_uu[0])
circuit.reset(qreg_uu[0])
circuit.ccx(qreg_a[0], qreg_q[0], qreg_d[0])
circuit.swap(qreg_a[0], qreg_d[0])
circuit.x(qreg_n[1])
circuit.swap(qreg_n[1], qreg_g[0])
circuit.reset(qreg_g[0])
circuit.ccx(qreg_a[1], qreg_q[0], qreg_g[0])
circuit.swap(qreg_a[1], qreg_g[0])
circuit.ccx(qreg_b[0], qreg_q[0], qreg_j[0])
circuit.swap(qreg_b[0], qreg_j[0])
circuit.ccx(qreg_b[1], qreg_q[0], qreg_k[0])
circuit.swap(qreg_b[1], qreg_k[0])
circuit.ccx(qreg_m[0], qreg_q[0], qreg_l[0])
circuit.swap(qreg_m[0], qreg_l[0])
circuit.ccx(qreg_m[1], qreg_q[0], qreg_r[0])
circuit.swap(qreg_m[1], qreg_r[0])
circuit.ccx(qreg_n[0], qreg_q[0], qreg_uu[0])
circuit.swap(qreg_n[0], qreg_uu[0])
circuit.x(qreg_n[1])
circuit.ccx(qreg_n[1], qreg_q[0], qreg_zz[0])
circuit.swap(qreg_n[1], qreg_zz[0])
circuit.measure(qreg_a[0], creg_sorted[0])
circuit.measure(qreg_a[1], creg_sorted[1])
circuit.measure(qreg_b[0], creg_sorted[2])
circuit.measure(qreg_b[1], creg_sorted[3])
circuit.measure(qreg_m[0], creg_sorted[4])
circuit.measure(qreg_m[1], creg_sorted[5])
circuit.measure(qreg_n[0], creg_sorted[6])
circuit.measure(qreg_n[1], creg_sorted[7])

## Quantum sorting algorithm circuit test with IBM qiskit (python)

The tests are carried out by starting the algorithm on the quantum simulator:

To start the quantum algorithm, you must first initialize:

%matplotlib inline
# Importing standard Qiskit libraries
from qiskit import QuantumCircuit, execute, Aer, IBMQ
from qiskit.compiler import transpile, assemble
from qiskit.tools.jupyter import *
from qiskit.visualization import *
from iqx import *

provider = IBMQ.load_account()

And to perform the test for example by executing the quantum circuit 100 times and displaying the probability of measurement in the form of a histogram:

backend = Aer.get_backend('qasm_simulator')
results = execute(circuit, backend=backend, shots=100).result()
plot_histogram(answer)

We obtain the result which confirms that with Quantum sort algorithm we have more change in measuring the quantum state of the sorted list than the other combinations (in my test 5%):

After 200 shots i get :

## Quantum Sort & Internal links :

http://128mots.com/index.php/en/category/non-classe-en/

## Introduction

When I wrote this blog post (this Pytorch tutorial), I remembered the challenge I set for myself at the beginning of the year to learn deep learning, I did not even know Python at the time. What makes things difficult is not necessarily the complexity of the concepts, but it starts with questions like: What framework to use for deep learning? Which activation function should I choose? Which cost function is best suited for my problem?

My personal experience has been to study the PyTorch framework and especially for the theory the online course made available for free by Yan LeCun whom I thank (link here https://atcold.github.io/pytorch-Deep-Learning/) . I still have to learn and work in the field but through this blog post, I would like to share and give you an overview of what I have learned about deep learning this year.

## Deep Learning: Which activation and cost function to choose?

The objective of this article is to sweep through this central topic in DeepLearning of choosing the activation and cost (loss) function according to the problem you are looking to solve by means of a DeepLearning algorithm.

We are talking about the activation function of the last layer of your model, that is to say the one that gives you the result.

This result is used in the algorithm which checks for each learning the difference between the result predicted by the neural network and the real result (gradient descent algorithm), for this we apply to the result a cost function (loss function) which represents this difference and which you seek to minimize as you practice in order to reduce the error. (See this article here to learn more : https://machinelearnia.com/descente-de-gradient/)

As a reminder, the machine learns by minimizing the cost function, iteratively by successive training steps, the result of the cost function and taken into account for the adjustment of the parameters of the neurons (weight and bias for example for linear layers) .

This choice of activation function on the last layer and the cost function to be minimized (loss function) is therefore crucial since these two elements combined will determine how you are going to solve a problem using your neural network.

This article also presents simple examples that use the Pytorch framework in my opinion a very good tool for machine learning.

The question here to ask is the following:

• Am I looking to create a model that performs binary classification? (In this case you are trying to predict a probability that the result of an entry will be 0 or 1)
Am I looking to calculate / predict a numeric value with my neural network? (In this case you are trying to predict a decimal value for example at the output that corresponds to your input)
• Am I looking to create a model that performs single or multiple label classification for a set of classes? (In this case, you are trying to predict for each output class the probability that an input matches this class)
• Am I looking to create a model that searches for multiple classes within a set of possible classes? (In this case, you are trying to predict for each class at the exit its attendance rate at the entrance).

### Binary classification problem:

You are trying to predict by means of your DeepLearning algorithm whether a result is true or false, and more precisely it is a probability that a result of 1 or that of a result of 0 that you will get.

The output size of your neural network is 1 (final layer) and you seek to obtain a result between 0 and 1 which will be assimilated to the probability that the result is 1 (example at the output of the network if you obtain 0.65 this will correspond to 65% chance that the result is true).
Application example: Predicting the probability that the home team in a football match will win. The closer the value is to 1 the more the home team has a chance of winning, and conversely the closer the score is to 0 the more chance the home team has of losing.
It is possible to qualify this result using a threshold, for example by admitting that if the output is> 0.5 then the result is true if not false.

##### Final activation function (case of binary classification):

The final activation function should return a result between 0 and 1, the correct choice in this case may be the sigmoid function. The sigmoid function will easily translate a result between 0 and 1 and therefore is ideal for translating the probability that we are trying to predict.

import math
import numpy as np
import matplotlib.pyplot as plt
def sigmoid(x):
a = []
for item in x:
a.append(1/(1+math.exp(-item)))
return a
x = np.arange(-10., 10., 0.2)
sig = sigmoid(x)
plt.plot(x,sig)
plt.show()
##### The cost function – Loss function (case of binary classification):

You have to determine during training the difference between the probability that the model predicts (translated via the final sigmoid function) and the true and known response (0 or 1). The function to use is Binary Cross Entrropy because this function allows to calculate the difference between 2 probability.
The optimizer will then use this result to adjust the weights and biases in your model (or other parameters depending on the architecture of your model).

###### Example :

In this example I will create a neural network with 1 linear layer and a final sigmoid activation function.

First, I perform all the imports that we will use in this post:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from collections import OrderedDict
import math
from random import randrange
import torch
import torch.nn as nn
from torch.nn.parameter import Parameter
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms
from echoAI.Activation.Torch.bent_id import BentID
from sklearn.model_selection import train_test_split
import sklearn.datasets
from sklearn.metrics import accuracy_score
import hiddenlayer as hl
import warnings
warnings.filterwarnings("ignore")
from torchviz import make_dot, make_dot_from_trace

The following lines allow you not to display the warnings in python:

import warnings
warnings.filterwarnings("ignore")

I will generate 1000 samples generated by the sklearn library allowing me to have a set of tests for a binary classification problem. The test set covers 2 decimal inputs and a binary output (0 or 1). I also display the result as a point cloud:

inputData,outputData = sklearn.datasets.make_moons(1000,noise=0.3)
plt.scatter(inputData[:,0],inputData[:,1],s=40,c=outputData,cmap = plt.cm.get_cmap("Spectral"))

The test set generated here corresponds to a data set with 2 classes (class 0 and 1).

The goal of my neural network is therefore a binary classification of the input.

I will take 20% of this test set as test data and the remaining 80% as training data. I split the test set again to create a validation set on the training dataset.

X_train, X_test, y_train, y_test = train_test_split(inputData, outputData, test_size=0.20, random_state=42)

In my example I’m going to create a network with any architecture what interests us is to show how the last layer works:

fc1 = nn.Linear(2, 2)
fc2 = nn.Linear(2, 1)

model = nn.Sequential(OrderedDict([
('lin1', fc1),
('BentID1', BentID()),
('lin2', fc2),
('sigmoid', nn.Sigmoid())
]))
model = model.double()

I added a line to transform the pytorch model to use the double type. This avoids an error of the type:

Expected object of scalar type Double but got scalar type Float for argument

Also it will be necessary to use dtype = torch.double when creating torch.tensor at each training.

As indicated in my documents above the cost function is the Binary Cross Entropy function in this case (BCELoss : https://pytorch.org/docs/stable/generated/torch.nn.BCELoss.html#torch.nn.BCELoss) :

loss = nn.BCELoss()

I am using the Adam optimizer with a learning rate of 0.01:
Deep learning optimizers are algorithms or methods used to modify attributes of your neural network such as weights and bias in such a way that the loss function is minimized.

The learning rate is a hyperparameter that controls how much the model should be changed in response to the estimated error each time the model weights are updated.

Here I am using the Adam optimization algorithm. In Deep Learning Adam is a stochastic gradient descent method that calculates individual adaptive learning rates for different parameters from estimates of the first and second order moments of the gradients.

optimizer = optim.Adam(model.parameters(), lr=0.01)


I previously split the training data a second time to take 20% of the data as validation data. They allow us at each training to validate that we are not doing over-training (over-adjustment, or over-interpretation, or simply in English overfitting).

X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=0.20, random_state=42)


In this example I have chosen to implement the EarlyStopping algorithm with a patience of 5. This means that if the cost function of the validation data increases during 15 training sessions (ie the distance between the prediction and the true data). After 15 training sessions with an increasing distance, we reload the previous data which represents the configuration of the neural network producing a distance of the minimum cost function (in the case of our example this consists in reloading the weight and the bias for the 2 layers linear). As long as the function decreases, the configuration of the neural network is saved (weight and bias of the linear layers in our example).

Here I have chosen to display the weights and bias of the 2 linear layers every 10 workouts, to give you an idea of how the Adam optimization algorithm works:

I used https://github.com/waleedka/hiddenlayer to display various graphs around training and validation metrics. The graphics are thus refreshed during use.

history1 = hl.History()
canvas1 = hl.Canvas()

Here is the main code of the learning loop:

bestvalidloss = np.inf
saveWeight1 = fc1.weight
saveWeight2 = fc2.weight
saveBias1 = fc1.bias
saveBias2 = fc2.bias
wait = 0
for epoch in range(1000):
model.train()
result = model(torch.tensor(X_train,dtype=torch.double))
lossoutput = loss(result, torch.tensor(y_train,dtype=torch.double))
lossoutput.backward()
optimizer.step()
print("EPOCH " + str(epoch) + "- train loss = " + str(lossoutput.item()))
if((epoch+1)%10==0):
#AFFICHE LES PARAMETRES DU RESEAU TOUT LES 10 entrainements
print("*************** PARAMETERS WEIGHT & BIAS *********************")
print("weight linear1= " + str(fc1.weight))
print('Bias linear1=' + str(fc1.bias))
print("weight linear2= " + str(fc2.weight))
print('bias linear2=' + str(fc2.bias))
print("**************************************************************")
model.eval()
validpred = model(torch.tensor(X_valid,dtype=torch.double))
validloss = loss(validpred, torch.tensor(y_valid,dtype=torch.double))
print("EPOCH " + str(epoch) + "- valid loss = " + str(validloss.item()))

# Store and plot train and valid loss.
history1.log(epoch, trainloss=lossoutput.item(),validloss=validloss.item())
canvas1.draw_plot([history1["trainloss"], history1["validloss"]])

if(validloss.item() < bestvalidloss):
bestvalidloss = validloss.item()
#
saveWeight1 = fc1.weight
saveWeight2 = fc2.weight
saveBias1 = fc1.bias
saveBias2 = fc2.bias
wait = 0
else:
wait += 1
if(wait > 15):
#Restauration des mailleurs parametre et early stopping
fc1.weight = saveWeight1
fc2.weight = saveWeight2
fc1.bias = saveBias1
fc2.bias = saveBias2
print("##############################################################")
print("stop valid because loss is increasing (EARLY STOPPING) afte EPOCH=" + str(epoch))
print("BEST VALID LOSS = " + str(bestvalidloss))
print("BEST PARAMETERS WHEIGHT AND BIAS = ")
print("FIRST LINEAR : WEIGHT=" + str(saveWeight1) + " BIAS = " + str(saveBias1))
print("SECOND LINEAR : WEIGHT=" + str(saveWeight2) + " BIAS = " + str(saveBias2))
print("##############################################################")
break
else:
continue

Here is an overview of the result, training stops after about 200 epochs (an “epoch” is a term used in machine learning to refer to a passage of the complete training data set). In my example I didn’t use a batch to load the data so the epoch count is the iteration count.

Evolution of the cost function on the training test and the validation test

Here’s a look at the prediction results:

result = model(torch.tensor(X_test,dtype=torch.double))
plt.scatter(X_test[:,0],X_test[:,1],s=40,c=y_test,cmap = plt.cm.get_cmap("Spectral"))
plt.title("True")
plt.colorbar()
plt.show()
plt.scatter(X_test[:,0],X_test[:,1],s=40,c=result.data.numpy(),cmap = plt.cm.get_cmap("Spectral"))
plt.title("predicted")
plt.colorbar()
plt.show()

To display the architecture of the neural network I used the hidden layer library which requires the installation of graphviz to avoid the following error:

RuntimeError: Make sure the Graphviz executables are on your system's path

On Mac OS X, for example, I installed via Homebrew:

brew install graphviz

Here is an overview of the hidden layer graph:

# HiddenLayer graph
hl.build_graph(model, torch.zeros(2,dtype=torch.double))

I also used PyTorchViz https://github.com/szagoruyko/pytorchviz/blob/master/examples.ipynb to display a graph of Pytorch operations during the forward of an entry and thus we can clearly see what will happen pass during the backward (ie the application the calculation of the dols / dx differential for all the parameters x of the network which has requires_grad = True).

make_dot(model(torch.ones(2,dtype=torch.double)), params=dict(model.named_parameters()))

### Regression problem (numeric / decimal value calculation):

In this case, you are trying to predict a continuous numerical quantity using your DeepLearning algorithm.
In this case, the gradient descent algorithm consists in comparing the difference between the numerical value predicted by the network and the true value.
The output size of the network will be 1 since there is only one value to predict.

##### Final activation function (decimal or numeric value case):

The activation function to use in this case depends on the range in which your data is located.

For a data between -infinite and + infinite then you can use a linear function at the output of your network.

Linear function graph function. https://pytorch.org/docs/stable/generated/torch.nn.Linear.html. The particularity is that this linear function (of the type y = ax + b) contains learnable parameters (weight and bias which are modified by the optimizer over the training sessions, i.e. y = weight * x + bias).

You can also use the Relu function if your value to predict is strictly positive
the output of ReLu is the maximum value between zero and the input value. An output is zero when the input value is negative and the input value when the input is positive.
Note that unlike the rectified linear activation function Relu does not have an adjustable parameter (learnable).

https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html

Another possible function is PReLu (parametric linear rectification unit):

Also another possible final activation function is Bent identity.

Finally I recommend the possible use of Parametric Soft Exponential, I use the echoAi implementation because it is not natively in PyTorch

##### The cost function (regression case, calculation of numerical value):

The cost function which makes it possible to determine the distance between the predicted value and the real value is the average of the squared distance between the 2 predictions. The function to use is mean squared error (MSE): https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html

Here is a simple example that illustrates the use of each of the final functions:

I started off with the example of house prices which I then adapted to test with each of the functions.

First, I import the necessary libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from collections import OrderedDict
import math
from random import randrange
import torch
import torch.nn as nn
from torch.nn.parameter import Parameter
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms
from echoAI.Activation.Torch.bent_id import BentID
from echoAI.Activation.Torch.soft_exponential import SoftExponential
from sklearn.model_selection import train_test_split
import sklearn.datasets
from sklearn.metrics import accuracy_score
import hiddenlayer as hl
import warnings
warnings.filterwarnings("ignore")
from torchviz import make_dot, make_dot_from_trace
from sklearn.datasets import make_regression
from torch.autograd import Variable

I create the dataset which is quite simple and display it:

house_prices_array = [30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140]
house_price_np = np.array(house_prices_array, dtype=np.float32)
house_price_np = house_price_np.reshape(-1,1)
house_price_tensor = Variable(torch.from_numpy(house_price_np))
house_size = [ 7.5, 7, 6.5, 6.0, 5.5, 5.0, 4.5,3.5,3.2,2.8,3.0,2.5]
house_size_np = np.array(house_size, dtype=np.float32)
house_size_np = house_size_np.reshape(-1, 1)
house_size_tensor = Variable(torch.from_numpy(house_size_np))
import matplotlib.pyplot as plt
plt.scatter(house_prices_array, house_size_np)
plt.xlabel("House Price $") plt.ylabel("House Sizes") plt.title("House Price$ VS House Size")
plt.show()

In the example, we see that the function to find is close to
f (x) = – 0.05 * x + 9
Example: – 0.05 * 40 + 9 = 7 and -0.05 * 30 + 9 = 7.5

I set the bias and the weight to -0.01 and 8 to limit the training time.

###### Linear activation function (Solving regression problem):
fc1 = nn.Linear(1, 1)

model = nn.Sequential(OrderedDict([
('lin', fc1)
]))

model = model.float()
fc1.weight.data.fill_(-0.01)
fc1.bias.data.fill_(8)

I declare MSELoss and an Adam optimizer tuned with a learning rate 0.01

loss = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

Here is the training loop:

history1 = hl.History()
canvas1 = hl.Canvas()
for epoch in range(100):
model.train()
result = model(house_price_tensor)
lossoutput = loss(result, house_size_tensor)
lossoutput.backward()
optimizer.step()
print("EPOCH " + str(epoch) + "- train loss = " + str(lossoutput.item()))
history1.log(epoch, trainloss=lossoutput.item())
canvas1.draw_plot([history1["trainloss"]])

After a hundred training sessions we obtain an MSELoss = 0.13

If I display the weights and biases found by the model I get in my example:

print("weight" + str(fc1.weight))
print("bias" + str(fc1.bias))

The network therefore executes here the function f (x) = -0.0408 * x + 8.1321

I then display the predicted result to compare it with the actual result:

result = model(house_price_tensor)
plt.scatter(house_prices_array, house_size_np)
plt.title("True")
plt.show()
plt.scatter(house_prices_array,result.detach().numpy())
plt.title("predicted")
plt.show()

Here is a preview of the architecture diagram for a simple Pytorch neuron network with Linear function, we see the multiplication operator and the addition operator to execute y = ax + b.

hl.build_graph(model, torch.ones(1,dtype=torch.float))

###### ReLu activation function (Regression problem solving):

ReLu is not an activation function with “learnable” parameters (modified by the optimizer) so I add a linear layer upstream for the test:

The result is identical, which is logical since in my case reLu only passes the result which is strictly positive

In the previous example, a ReLu layer must be added at the output after the linear layer:

fc1 = nn.Linear(1, 1)

model = nn.Sequential(OrderedDict([
('lin', fc1),
('relu',nn.ReLU())
]))

model = model.float()
fc1.weight.data.fill_(-0.01)
fc1.bias.data.fill_(8)

The same for PrRelu in my case:
These two functions simply make a flat pass between the linear function and the output. ReLu remains interesting if you want to set to 0 all the outputs concerning a negative input.

###### Bent Identity Function (Regression Problem Solving):

Bent identity there is no learning parameter but the function does not just forward the input it applies the curved identity function to it which slightly modifies the result:

fc1 = nn.Linear(1, 1)

model = nn.Sequential(OrderedDict([
('lin', fc1),
('BentID',BentID())
]))

model = model.float()
fc1.weight.data.fill_(-0.01)
fc1.bias.data.fill_(8)

In our case after 100 training sessions the network found the following weights for the linear function upstream of bent identity the final loss is 0.78 so less good than with the linear function (the network converges less quickly):

In our case, the function applied by the network will therefore be:

-0.0481 ((math.sqt(x**2 + 1) -1)/2 + x) + 7.7386

Result obtained with Bent Identity:

hl.build_graph(model, torch.ones(1,dtype=torch.float))

I display the architecture of the Pytorch network which is now linear layer + bent identity layer :

Also :

make_dot(model(torch.ones(1,dtype=torch.float)), params=dict(model.named_parameters()))
###### Soft Exponential – (Regression problem solving)

Soft Exponential has a trainable alpha parameter.We are going to place a linear function upstream and softExponential in output and check the approximation performed in this case:

fc1 = nn.Linear(1, 1)
fse = SoftExponential(1,torch.tensor([-0.9],dtype=torch.float))
model = nn.Sequential(OrderedDict([
('lin', fc1),
('SoftExponential',fse)
]))

model = model.float()
fc1.weight.data.fill_(-0.01)
fc1.bias.data.fill_(8)

After 100 training sessions the results on the linear layer parameters and the soft exponential layer parameters are as follows:

print("weight" + str(fc1.weight))
print("bias" + str(fc1.bias))
print("fse alpha" + str(fse.alpha))

The result is not really good since the approximation is not done in the right direction,

We therefore notice that we could invert the function in our case to define a Soft Exponential customize activation function with a bias criterion and that is what we will do here.

###### Custom function – (Regression problem solving)

We declare a custom activation function to PyTorch compared to the original soft exponential I added the beta bias criterion as well as the torch.div inversion (1, …)

class SoftExponential2(nn.Module):
def __init__(self, in_features, alpha=None,beta=None):
super(SoftExponential2, self).__init__()
self.in_features = in_features
# initialize alpha
if alpha is None:
self.alpha = Parameter(torch.tensor(0.0))  # create a tensor out of alpha
else:
self.alpha = Parameter(torch.tensor(alpha))  # create a tensor out of alpha
if beta is None:
self.beta = Parameter(torch.tensor(0.0))  # create a tensor out of alpha
else:
self.beta = Parameter(torch.tensor(beta))  # create a tensor out of alpha

def forward(self, x):
if self.alpha == 0.0:
return x

if self.alpha < 0.0:

if self.alpha > 0.0:
return torch.add(torch.div(1,((torch.exp(self.alpha * x) - 1) / self.alpha + self.alpha)),self.beta)

We now have 2 parameters that can be trained in this custom function in Pytorch.

By also lowering the learning rate to 0.01 after 100 training sessions and initializing alpha = 0 .1 and beta = 0.7 I arrive at a loss <5

fse = SoftExponential2(1,torch.tensor([0.1],dtype=torch.float),torch.tensor([7.0],dtype=torch.float))

model = nn.Sequential(OrderedDict([
('SoftExponential',fse)
]))

model = model.float()
fc1.weight.data.fill_(-0.01)
fc1.bias.data.fill_(8)
print("fse alpha" + str(fse.alpha))
print("fse beta" + str(fse.beta))

### Categorization problem (predict a class among several classes possible) – single-label classifier with pytorch

You are looking to predict the probability that your entry will match a single class and exit betting multiple possible classes.
For example you want to predict the type of vehicle on an image allowed the classes: car, truck, train

The dimensioning of your output network corresponds to n neurons for n classes. And for each output neurons corresponding to a possible class to be predicted, we will obtain a probability between 0 and 1 which will represent the probability that the input corresponds to this class at the output. So for each class this comes down to solving a binary classification problem in part 1 but in addition to which it must be considered that an input can only correspond to a single output class.

##### Activation function Categorization problem (predict a class among several possible classes):

The activation function to be used on the final layer is a Softfmax function with n dimensions corresponding to the number of classes to be predicted.
Softmax is a mathematical function that converts a vector of numbers (tensor) into a vector of probabilities, where the probabilities of each value are proportional to the relative scale of each value in the vector.

In other words, for an output tensor it will return a probability for each class by scaling each of them so that their sum is equal to one.

https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html

##### Cost function – Categorization problem (predict a class among several possible classes):

The cost function to be used is close to that used in the case of binary classification but with this notion of probability vector.
Cross Entropy will evaluate the difference between 2 probability distributions. We will therefore use it to compare the predicted value and the true value.

Based on the tutorial and the data set on this page we have:https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

For info if you get the following error:

ImportError: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html

You need to setup Iprogress on jupyter lab

pip install ipywidgets
jupyter nbextension enable --py widgetsnbextension


In our example we will first retrieve the input dataset

https://www.cs.toronto.edu/~kriz/cifar.html

This is already integrated with Pytorch to allow us to perform certain tests.

###### Exemple (problème de catégorisation) :

Preparation of the dataset:

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
[transforms.ToTensor(),
transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

We then define a function which allows to visualize the imshow image and we display for each image the unique label associated with it:

import matplotlib.pyplot as plt
import numpy as np

# functions to show an image

def imshow(img):
img = img / 2 + 0.5     # unnormalize
npimg = img.numpy()
plt.imshow(np.transpose(npimg, (1, 2, 0)))
plt.show()

# get some random training images
images, labels = dataiter.next()

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

A convolutional neural network (ConvNet or CNN) is a DeepLearning algorithm that can take an image as input, it sets a score (weight and bias which are learnable parameters) to various aspects / objects of the image and be able to differentiate one from the other.

The architecture of a convolutional neuron network is close to that of the model of neuron connectivity in the human brain and was inspired by the organization of the visual cortex.

Individual neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. A collection of these fields overlap to cover the entire visual area.

In our case we are going to use with Pytorch a Conv2d layer and a pooling layer.

https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html : The Conv2d layer is the 2D convolution layer.

https://www.freecodecamp.org/news/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050/

A filter in a conv2D layer has a height and a width. They are often smaller than the input image. This filter therefore moves over the entire image during training (this area is called the receptive field).

The Max Pooling layer is a sampling process. The objective is to sub-sample an input representation (image for example), by reducing its size and by making assumptions on the characteristics contained in the grouped sub-regions.

In my example with PyTorch the declaration is made :

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.conv1 = nn.Conv2d(3, 6, 5)
self.pool = nn.MaxPool2d(2, 2)
self.conv2 = nn.Conv2d(6, 16, 5)
self.fc1 = nn.Linear(16 * 5 * 5, 120)
self.fc2 = nn.Linear(120, 84)
self.fc3 = nn.Linear(84, 10)

def forward(self, x):
x = self.pool(F.relu(self.conv1(x)))
x = self.pool(F.relu(self.conv2(x)))
x = x.view(-1, 16 * 5 * 5)
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x

net = Net()

Define the criterion of the cost function with CrossEntropyLoss, here we use the SGD opimizer rather than adam (more info here on the comparison between optimizer https://ruder.io/optimizing-gradient-descent/)

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

The training loop:

for epoch in range(2):  # loop over the dataset multiple times

running_loss = 0.0
for i, data in enumerate(trainloader, 0):
# get the inputs; data is a list of [inputs, labels]
inputs, labels = data

# forward + backward + optimize
outputs = net(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()

# print statistics
running_loss += loss.item()
if i % 2000 == 1999:    # print every 2000 mini-batches
print('[%d, %5d] loss: %.3f' %
(epoch + 1, i + 1, running_loss / 2000))
running_loss = 0.0

print('Finished Training')

The Pytorch neuron network is saved

PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)

We load a test and use the network to predict the outcome.

dataiter = iter(testloader)
images, labels = dataiter.next()

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

In this example the “true” labels are “Cat, ship, ship, plane”, we then launch the network to make a prediction:

net = Net()
outputs = net(images)
_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
for j in range(4)))

Here is an overview of the architecture of this convolution network.

import hiddenlayer as hl
from torchviz import make_dot, make_dot_from_trace
make_dot(net(images), params=dict(net.named_parameters()))

### Categorization problem (predict several class among several classes possible) – multiple-label classifier with pytorch – Pytorch tutorial

Overall, it is about predicting several probabilities for each of the classes to indicate their probabilities of presence in the entry. One possible use is to indicate the presence of an object in an image.

The problem then comes back to a problem of binary classification for n classes.

The final activation function is sigmoid and the loss function is Binary cross entropy

This internet example perfectly illustrates the use of BCELoss in the case of the prediction of several classes among several possible classes.

In this example: The image dataset used is the CelebFaces Large Scale Attribute Dataset (CelebA).
http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html

In this data dataset there are 200K images with 40 different class labels and each image has a different background footprint and there are a lot of different variations making it difficult for a model to classify each label effectively class.

• Step2: Download the list_attr_celeba.txt file which contains the annotations for each image on the same site http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
• Step3: Opening the file annotations you can see the 40 labels: 5_o_Clock_Shadow Arched_Eyebrows Attractive Bags_Under_Eyes Bald Bangs Big_Lips Big_Nose Black_Hair Blond_Hair Blurry Brown_Hair Bushy_Eyebrows Chubby Double_Chin Eyeglasses Goatee Gray_Hair Heavy_Makeup High_Cheekbones Male Mouth_Slightly_Open Mustache Narrow_Eyes No_Beard Oval_Face Pale_Skin Pointy_Nose Receding_Hairline Rosy_Cheeks Sideburns Smiling Straight_Hair Wavy_Hair Wearing_Earrings Wearing_Hat Wearing_Lipstick Wearing_Necklace Wearing_Necktie Young
For each of the images present in the test set there is a value of 1 or -1 specifying for image whether the class is present in the image.
The objective here will be to predict for each of the classes the probability of its presence in the image.
• Step4: you can load the notebook which is present on https://github.com/vatsalsaglani/MultiLabelClassifier/blob/master/prediction_celebA.ipynb

Here is the results:

Use of the imshow function (see previous example for single label prediction)

Architecture of the convolutional neuron network:

import torch.nn.functional as F
class MultiClassifier(nn.Module):
def __init__(self):
super(MultiClassifier, self).__init__()
self.ConvLayer1 = nn.Sequential(
nn.Conv2d(3, 64, 3), # 3, 256, 256
nn.MaxPool2d(2), # op: 16, 127, 127
nn.ReLU(), # op: 64, 127, 127
)
self.ConvLayer2 = nn.Sequential(
nn.Conv2d(64, 128, 3), # 64, 127, 127
nn.MaxPool2d(2), #op: 128, 63, 63
nn.ReLU() # op: 128, 63, 63
)
self.ConvLayer3 = nn.Sequential(
nn.Conv2d(128, 256, 3), # 128, 63, 63
nn.MaxPool2d(2), #op: 256, 30, 30
nn.ReLU() #op: 256, 30, 30
)
self.ConvLayer4 = nn.Sequential(
nn.Conv2d(256, 512, 3), # 256, 30, 30
nn.MaxPool2d(2), #op: 512, 14, 14
nn.ReLU(), #op: 512, 14, 14
nn.Dropout(0.2)
)
self.Linear1 = nn.Linear(512 * 14 * 14, 1024)
self.Linear2 = nn.Linear(1024, 256)
self.Linear3 = nn.Linear(256, 40)

def forward(self, x):
x = self.ConvLayer1(x)
x = self.ConvLayer2(x)
x = self.ConvLayer3(x)
x = self.ConvLayer4(x)
x = x.view(x.size(0), -1)
x = self.Linear1(x)
x = self.Linear2(x)
x = self.Linear3(x)
return F.sigmoid(x)

An overview of the network architecture:

### SUMMARY – Pytorch tutorial :

Here is a summary of my pytorch tutorial : sheet that I created to allow you to choose the right activation function and the right cost function more quickly according to your problem to be solved.

### Pytorch tutorial : THE 5 BEST DEEP LEARNING LINKS

https://atcold.github.io/pytorch-Deep-Learning/ : Dree courses Yann LeCun

https://pytorch.org/ : Official web site PyToch

https://machinelearningmastery.com : Advanced on deep learning

https://dataanalyticspost.com/

## Pytorch tutorial – internal links

http://128mots.com/index.php/en/category/non-classe-en/

http://128mots.com/index.php/category/python/

## PyTorch how to load a pre-trained model?

There are several approaches to recording (serializing) and loading (deserialize) patterns for inference in PyTorch.

For example you may need to load a model that is already trained and back up that comes from the internet. More recently I answered this question on a discussion forum https://discuss.pytorch.org/t/i-want-to-do-machine-learning-with-android/98753. I take advantage of this article to give some details.

In PyTorch you can save a model by storing in its file its state_dict these are Python dictionaries, they can be easily recorded, updated, modified and restored, adding great modularity to PyTorch models and optimizers.

In my example:

ESPNet is backed up with this method

https://github.com/sacmehta/ESPNet/tree/master/pretrained/encoder

I managed to load the encoder model using the ESPNet class that makes a load_state_dict

import torch
model = ESPNet(20,encoderFile="espnet_p_2_q_8.pth", p=2, q=8)
example = torch.rand(1, 3, 224, 224)
traced_script_module = torch.jit.trace(model, example)
traced_script_module.save("model.pt")


The exit is

Encode loaded!


Note that ESPNet uses the default GPU and that an adaptation in Model.py in https://github.com/sacmehta/ESPNet/tree/master/train was required by replacing:

self.encoder.load_state_dict(torch.load(encoderFile)


By:

self.encoder.load_state_dict(torch.load(encoderFile,map_location='cpu'))


Otherwise if you don’t make this adjustment don’t have the ability to launch on a GPU you’ll get the following error:

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location-torch.device ('cpu') to map your storages to the CPU.

## BACKUP AND LOAD BY PYTORCH SCRIPT

https://pytorch.org/docs/stable/jit.html

TorchScript is a way to create models that can be made that can be optimized from the PyTorch code.

Any TorchScript program can be saved from a Python process and loaded into a process where there is no Python dependency.

The code below allows you to save the pre-trained ESPNet model that was backed up by the classic torch.save method via the use of TorchScript.

import torch

model = ESPNet(20,encoderFile="espnet_p_2_q_8.pth", p=2, q=8)
example = torch.rand(1, 3, 224, 224)
traced_script_module = torch.jit.trace(model, example)
#Exemple de save avec TorchScript
torch.jit.save(traced_script_module, "scriptmodel.pth")

An example for loader via TorchScript below:

import torch
model = torch.jit.load("scriptmodel.pth")

## RESOURCES ON THE SUBJECT

https://pytorch.org/docs/stable/jit.html

https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html

https://stackoverflow.com/questions/53900396/what-are-torch-scripts-in-pytorch

## BCEWithLogitsLoss Pytorch with python

BCEWithLogitsLoss Here are some additional explanations on using Binary Cross-Entropy Loss with Pytorch in python.

## Introduction

I have been asked recently on How to find the code BCEWithLogitsLoss (https://discuss.pytorch.org/t/implementation-of-binary-cross-entropy/98715/2) for a function called in PYTORCH with the function handle_torch_function it takes me some time to understand it, and i will share it with you if it helps:

The question was about BCEWithLogitsLoss = BCELoss + sigmoid() ? My answer can be apply if you want to analyse the code of all the functions that figures in the ret dictionnary from https://github.com/pytorch/pytorch/blob/master/torch/overrides.py

BCEWithLogitsLoss is a combination of BCELOSS + a Sigmoid layer i. This is more numerically stable than using a plain Sigmoid followed by a BCELoss as, by combining the operations into one layer, it takes advantage of the log-sum-exp trick for numerical stability see : https://en.wikipedia.org/wiki/LogSumExp

## BCEWithLogitsLoss in details

1. The code of the BCEWithLogitsLoss Class can be found in https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/loss.py
    def forward(self, input: Tensor, target: Tensor) -> Tensor:
return F.binary_cross_entropy_with_logits(input, target,
self.weight,
pos_weight=self.pos_weight,
reduction=self.reduction)


The F oject is imported from functionnal.py here : https://github.com/pytorch/pytorch/blob/master/torch/nn/functional.py

You will find the function called

def binary_cross_entropy_with_logits(input, target, weight=None, size_average=None,
reduce=None, reduction='mean', pos_weight=None):


It calls the handle_torch_function in https://github.com/pytorch/pytorch/blob/master/torch/overrides.py
You will find an entry of the function binary_cross_entropy_with_logits in the ret dictionnary wich contain every function that can be overriden in pytorch.
This is the Python implementation of torch_function

Then the code called is in the C++ File
https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/Loss.cpp

Tensor binary_cross_entropy_with_logits(const Tensor& input, const Tensor& target, const Tensor& weight, const Tensor& pos_weight, int64_t
...


## LIENS DE RESSOURCES :

https://pytorch.org/docs/stable/generated/torch.nn.BCEWithLogitsLoss.html

https://discuss.pytorch.org/t/bceloss-vs-bcewithlogitsloss/33586

https://stackoverflow.com/questions/59246416/the-output-of-numerical-calculation-of-bcewithlogitsloss-for-pytorch-example

https://www.semicolonworld.com/question/61244/implementing-bcewithlogitsloss-from-pytorch-in-keras

http://128mots.com/index.php/en/category/non-classe-en/

## Mastering IBM COBOL Language – Part 1 – FREE TRAINING

This article follows my introduction to the COBOL language. I advise you to reread the first part that deals with the general principles of COBOL. Here: http://128mots.com/index.php/2020/10/07/ibm-cobol/ and English http://128mots.com/index.php/en/2020/10/07/ibm-cobol-3/ (IBM COBOL FREE TRAINING)

Here we discuss the general concepts of the structure of a COBOL program.

## DATA DIVISION

All the data that will be used by the program is located in the Data Division. This is where all the memory allowances required by the program are taken care of. This division is optional.

## FILE MANAGEMENT:

FILE SECTION describes data sent or from the system, especially files.

The SYNtax in COBOL is the following

DATA DIVISION.
FILE SECTION.
FD/SD NameOfFile[RECORD CONTAINS intgr CHARACTERS]
[BLOCK CONTAINS intgr RECORDS]
[DATA RECORD IS NameOfRecord].
[RECORDING MODE IS {F/V/U/S}]

FD describes the files and SD the sorting files.

## FILE IN INPUT

In FILE-CONTROL the statement will be:

           SELECT FMASTER ASSIGN FMASTER
FILE STATUS W-STATUS-FMASTER.

If the input file is indexed:
           SELECT FMASTER ASSIGN FMASTER
ORGANIZATION IS INDEXED
RECORD KEY IS FMASTER-KEY
FILE STATUS W-STATUS-FMASTER.

In this case at the file-SECTION level we will have:

      FMAITRE as an appetizer
FD FMAITRE.
01 ENR-FMAITRE.
Statements from the registration areas

At the JCL level the statement will be of the form:

ENTREE DD DSN-SAMPLE. INPUTF,DISP-SHR

## FILE OUT

The JCL statement will then be:

OUTFILE DD DSN-SAMPLE. OUTPUTF,DISP(,CATLG,DELETE),
LRECL-150,RECFM-FB

RECFM specifies the characteristics of records with fixed length (F), variable length (V), variable length ASCII (D) or indefinite length (U). Records that are said to be blocked are described as FB, VB or DB.

## OPENING AND CLOSING FILE IN PROCEDURE DIVISION

COBOL uses PROCEDURE DIVISION mechanisms to perform writing, closing and file openings.

Entry file opening:

OPEN INPUT FICENT

Opening the output file:

OPEN OUTPUT FICSOR

File closure:

CLOSE FICENT
CLOSE FICSOR

File playback:

READ ACTENR
AT END MOVE 'O' TO LAST-RECORD


File writing

WRITE SOR-ENR

## EXTERNAL LINKS TO RESOURCES (IBM COBOL FREE TRAINING)

Below are some links that I found interesting that also deal with file management with COBOL.

Example of IBM file management: https://www.ibm.com/support/knowledgecenter/en/SS6SGM_5.1.0/com.ibm.cobol51.aix.doc/PGandLR/ref/rpfio13e.html

Tutorialspoint an article on file management https://www.tutorialspoint.com/cobol/cobol_file_handling.htm

COURSERA COBOL with VSCODE: https://www.coursera.org/lecture/cobol-programming-vscode/file-handling-YVlcf

MEDIUM an interesting article for COBOL beginners: https://medium.com/@yvanscher/7-cobol-examples-with-explanations-ae1784b4d576

Also on his blog: http://yvanscher.com/2018-08-01_7-cobol-examples-with-explanations–ae1784b4d576.html

Many example COBOL free tutorials and sample code: http://www.csis.ul.ie/cobol/examples/default.htm

GITHUB Awesome-cobol https://github.com/mickaelandrieu/awesome-cobol you

IBM COBOL FREE TRAINING

## How do I make a career in COBOL?

COBOL (COmmon Business Oriented Language) is a business-oriented language it is very suitable for data processing with high performance and precision, it is offered by IBM.

You can be sure that any purchases or withdrawals you make with your credit card start a COBOL program. Every day COBOL processes millions of transactions. From my point of view learning COBOL is a very interesting set of knowledge.

I have worked with this language for more than 10 years and I share here some notes that will probably allow you to get your hands on it.

The mainframe is modernizing with the ability to program with the latest tools such as the free code editor VsCode and zowe extension as well as Z Open editor, run in the cloud in environments such as Open Shift and integrate devops principles with tools such as Github, Jenkins, Wazi.

## COBOL Syntax

The SYNtax of COBOL is quite simple and is similar to the natural language in English.

The code is standardized according to columns that can describe 5 key areas.

Area sequence: specifies a sequence number of the line of code sometimes, sometimes blank

Indicator Area: May contain an indicator for example – to indicate that the line is a comment, D to indicate that the line runs only in debugging mode.

A AREA: Contains divisions, sections, paragraphs and Lift

B AREA: Sentences and statements of the program cobol for example COMPUTE something…

Area Identification: space to ignore and leave blank.

There are also words reserved in COBOL you will find on the link the list of words reserved in COBOL. https://www.ibm.com/support/knowledgecenter/SSZJPZ_9.1.0/com.ibm.swg.im.iis.ds.mfjob.dev.doc/topics/r_dmnjbref_COBOL_Reserved_Words.html

## Divisions:

The code is structured by divisions that contain Paragraph-composed Sections themselves made up of Sentences and Statements.

Example of sentences:

Note that the point corresponds to an implied scope terminator.

There are 4 Divisions in a COBOL program:

– DATA DIVISION: allows you to set up the data management that will be processed by the program.

– DIVISION IDENTIFICATION: Program name and programmer, program date, program purpose.

– QUARTER ENVIRONEMENT: Type of computer uses and mapping between files used in the program and dataset on the system (link between program and system)

– PROCEDURE DIVISION: This is where the business code composed of the different paragraphs to be executed is contained.

## The variables in Cobol:

As in other languages, letters can be used to represent values stored in memory.

The name of a variable is a maximum of 30 characters.

A Picture clause allows you to determine the type of variable.

PIC 9: Digital length is in brackets.

PIC 9(5): digital variable of 5 digits the maximum length for a digital is 18.

PIC A for a character

PIC X(11): an alphanumeric with a maximum length of 255

It is possible to have types edited using symbols:

PIC 9(6)V99 for a 6 digits and 2 decimals separated by comma.

PIC $9,999V99 to represent an amount Note that COBOL provides constant liters such as ZEROES, SPACE, SPACES, LOW-VALUE … More information on this link: https://www.ibm.com/support/knowledgecenter/SS6SG3_4.2.0/com.ibm.entcobol.doc_4.2/PGandLR/ref/rllancon.htm If you are new to IBM COBOL and want to do a serious and not too expensive apprenticeship, I recommend you read this book: This book covers a lot of topics related to machine language, IBM Cobol training, Open Cobol IDE, DB2 to become a true Cobol programmers. See also my articles: ## How did I prepare the PRINCE2® Practitioner certification? I have just obtained the PRINCE2® Practitionner certification. PRINCE2® is a project management methodology used around the world that can adapt to all types of projects. With an average salary of$84,450 for a certified project manager, (here), I have to say that PRINCE2® is from my point of view a very interesting skill set to get.

Learning PRINCE2® is a bit like saying “I want to work with the best method to manage and control projects with an international scope.»

I share here some notes and my feedback as well as resources, which you might be interested in if you are considering certifying PRINCE2®.

The first step in certification is the PRINCE2® Fundamental (PRINCE2® Foundation exam, which validates that you have the knowledge to participate in a project that uses the PRINCE2®The PRINC
E2® Practitioner (PRINCE2® Practitioner) exam aims for a perfect mastery of the method for managing and managing a project, the Fundamental exam is a prerequisite.

I would advise you to read the official book Managing Successful Projects with PRINCE2®

Available here:

and you are entitled to it during the PRINCE2® Practitioner exam.
You can choose to pass the certification in French or English. I advise you to pass it in English because in my opinion it gives a more international character to the certification. You will also have an additional 30 minutes to take the exam if you do not take it in your native language.

Here’s an interesting resource list I found online:

Free online exam simulator for PRINCE2® Foundation https://mplaza.training/exam-simulators/prince2-foundation/

Free online exam simulator for PRINCE2® Practitioner https://mplaza.training/exam-simulators/prince2-practitioner/

The official website PRINCE2®de Axelos: https://www.axelos.com/best-practice-solutions/prince2

If you are looking for a book in French I recommend this one:

Intellectual Property :

## Algorithm the problem of the backpack (Knapsack problem) in more than 128 words

The problem of the algorithmic backpack is interesting and is part of the first digital science and computer science program.

This problem illustrates the gluttonous algorithms that list all the possibilities of solving a problem to find the best solution.

The problem of the backpack is a problem of optimization, i.e. a function that must be maximized or minimized and constraints that must be met.

The problem of the backpack

For a backpack of maximum capacity of P and N items each with its own weight and a set value, throw the items inside the backpack so that the final contents have maximum value.

Example of a statement:

• Maximum backpack capacity: 11 units
• Item number: 5
• Object Values: \$10.50,20,30.60
• Weight of Objects: '1,5,3,2,4'

What is the maximum value that can be put in the backpack considering the maximum capacity constraint of the bag which is 11?

Gluttonous algorithm

An effective solution is to use a gluttonous algorithm. The idea is to calculate the value/weight ratio for each object and sort the object based on this calculated ratio.

You take the object with the highest ratio and add until you can't add any more.

In fractional version it is possible to add fractions of item to the backpack.

Implementation of the problem in non-fractional version

Here is an implementation of the problem in a non-fractional version, i.e. you can't add a fraction of an object to the bag. Only whole objects can be added.

itemsac class:
def __init__ (self, weight, value, index):
self.index - index
self.weight - weight
self.value
self.report - value // weight
#Fonction for comparison between two ObjectsSac
#On compares the ratio calculated to sort them
def __lt__ (self, other):
return self.report< other.rapport

def getValeurMax (weights, values, ability):
TableTrie[]
for i in range:
tableTrie.append (ObjectSac (weigh[i]ts, value[i]s, i))

#Trier the elements of the bag by their report
tableTrie.sort (reverse - True)

MeterValeur - 0
for object in tableauTrie:
WeightCourant - int (object.weight)
ValueCourante - int (object.value)
if capacity - weightsCourant '0:
#on adds the object to the bag
#On subtracts capacity
capacity - weightsCourant
MeterValeur - ValueCourante
#On adds value to the bag
return counterValeur

Weights[1,5,3,2,4]
Values[10,50,20,30,60]
capacity - 11
Maximum value - getValeurMax (weight, values, capacity)
print ("Maximum value in the backpack," ValueMax) 

The result is as follows:

py sacados.py
Maximum value in backpack - 120

Implementation of the problem in fractional version

In a fractional version of the backpack problem you can add fractions of object to the backpack.

itemsac class:
def __init__ (self, weight, value, index):
self.index - index
self.weight - weight
self.value
self.report - value // weight
#Fonction for comparison between two ObjectsSac
#On compares the ratio calculated to sort them
def __lt__ (self, other):
return self.report< other.rapport

def getValeurMax (weights, values, ability):
TableTrie[]
for i in range:
tableTrie.append (ObjectSac (weigh[i]ts, value[i]s, i))

#Trier the elements of the bag by their report
tableTrie.sort (reverse - True)

MeterValeur - 0
for object in tableauTrie:
WeightCourant - int (object.weight)
ValueCourante - int (object.value)
if capacity - weightsCourant '0:
#on adds the object to the bag
#On subtracts capacity
capacity - weightsCourant
MeterValeur - ValueCourante
#On adds value to the bag
else
fraction - capacity / weightCouring
MeterValeur -Courante value - fraction
capacity - int (capacity - (weightsCourant - fraction))
Break
return counterValeur

Weights[1,5,3,2,4]
Values[10,50,20,30,60]
capacity - 11
Maximum value - getValeurMax (weight, values, capacity)
print ("Maximum value in the backpack," ValueMax) 

The result is as follows:

py sacados.py
Maximum value in backpack - 140.0