Author Topic: Microsoft Cognitive Toolkit (CNTK) with ASMC  (Read 346 times)

LiaoMi

  • Member
  • ****
  • Posts: 846
Microsoft Cognitive Toolkit (CNTK) with ASMC
« on: March 28, 2021, 02:58:26 AM »
Hi Nidud  :azn:,

Is there any chance of adapting this CNTK product for use with assembly language programming?

Microsoft Cognitive Toolkit (CNTK)
https://github.com/Microsoft/CNTK

The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes neural networks as a series of computational steps via a directed graph. In this directed graph, leaf nodes represent input values or network parameters, while other nodes represent matrix operations upon their inputs. CNTK allows users to easily realize and combine popular model types such as feed-forward DNNs, convolutional nets (CNNs), and recurrent networks (RNNs/LSTMs). It implements stochastic gradient descent (SGD, error backpropagation) learning with automatic differentiation and parallelization across multiple GPUs and servers. CNTK has been available under an open-source license since April 2015. It is our hope that the community will take advantage of CNTK to share ideas more quickly through the exchange of open source working code.
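
To make the directed-graph description concrete: a tiny linear layer y = W*x + b maps directly onto such a graph. Here is a minimal sketch against the CNTK C++ API (the same InputVariable/Parameter/Times/Plus calls that appear in the XOR example later in this thread; the shapes and names are only for illustration):

Code: [Select]
// Minimal sketch: y = W*x + b as a CNTK computation graph.
// Leaf nodes: the input variable x and the parameters W and b.
// Inner nodes: the Times and Plus matrix operations.
#include "CNTKLibrary.h"

using namespace CNTK;

FunctionPtr BuildTinyGraph(const DeviceDescriptor& device)
{
    auto x = InputVariable({ 3 }, DataType::Float, L"x");              // leaf: input value
    auto W = Parameter({ 2, 3 }, DataType::Float, 0.0f, device, L"W"); // leaf: network parameter
    auto b = Parameter({ 2 }, DataType::Float, 0.0f, device, L"b");    // leaf: network parameter
    return Plus(Times(W, x, L"times"), b, L"y");                       // inner nodes: matrix ops
}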

Download
CNTK for Windows v.2.7 CPU only - https://cntk.ai/dlwc-2.7.html
CNTK for Windows v.2.7 GPU - https://cntk.ai/dlwg-2.7.html

Linux etc - https://github.com/microsoft/CNTK/releases

Docs - https://github.com/microsoft/CNTK/tree/master/Documentation/Documents
MS Docs MSDN - https://docs.microsoft.com/en-us/cognitive-toolkit/
TensorFlow Meets Microsoft's CNTK - https://www.researchgate.net/profile/Dennis_Gannon/publication/293652558_TensorFlow_Meets_Microsoft's_CNTK/links/56ba3a5208ae2567351ebbe8/TensorFlow-Meets-Microsofts-CNTK.pdf

Books -
An Introduction to Computational Networks and the Computational Network Toolkit
https://www.microsoft.com/en-us/research/wp-content/uploads/2014/08/CNTKBook-20160217.pdf

Introduction to CNTK Succinctly
https://s3.amazonaws.com/ebooks.syncfusion.com/downloads/cntk_succinctly/cntk_succinctly.pdf?AWSAccessKeyId=AKIAWH6GYCX3T2N7V5N5&Expires=1616870448&Signature=1s1alLHRC6Psp6ORIXIkUhd9Tlg%3D
https://www.syncfusion.com/succinctly-free-ebooks/cntk-succinctly/getting-started

C++ API - https://docs.microsoft.com/en-us/cognitive-toolkit/CNTK-Library-API
The CNTK Library C++ API exposes CNTK's core computational, neural network composition & training, efficient data reading, and scalable model training facilities for developers. The C++ APIs are fully featured for both model training and evaluation, allowing both training and model serving to be driven from native code. This enables your native code to tune the online model using new data as part of an evaluation request (i.e. online learning).

Currently the best source of API documentation is inline in the API header file (CNTKLibrary.h) that contains the full C++ API definition. The API header files are also included in the binary release package under the Include directory.
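
As a rough sketch of the online-learning pattern described above (serve a prediction, then tune the model with the observed outcome): the calls used below (Evaluate, TrainMinibatch, Value::CreateBatch) all appear in the XOR example later in this thread, but the helper ServeAndTune itself and its surrounding setup are assumptions for illustration:

Code: [Select]
#include "CNTKLibrary.h"
#include <unordered_map>
#include <vector>

using namespace CNTK;

// Hypothetical helper: answer an evaluation request, then do one step of
// online learning once the true label for that request is known.
void ServeAndTune(FunctionPtr model, TrainerPtr trainer,
                  Variable input, Variable label,
                  const std::vector<float>& features,
                  const std::vector<float>& observedLabel,
                  const DeviceDescriptor& device)
{
    // 1. Evaluation request: run the model on the new data.
    std::unordered_map<Variable, ValuePtr> in  = { { input, Value::CreateBatch(input.Shape(), features, device) } };
    std::unordered_map<Variable, ValuePtr> out = { { model->Output(), nullptr } };
    model->Evaluate(in, out, device);
    // ... return out[model->Output()] to the caller ...

    // 2. Online learning: train on the (features, observed label) pair.
    std::unordered_map<Variable, ValuePtr> batch = {
        { input, Value::CreateBatch(input.Shape(), features, device) },
        { label, Value::CreateBatch(label.Shape(), observedLabel, device) }
    };
    trainer->TrainMinibatch(batch, device);
}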

Use C# and a CNTK Neural Network To Predict House Prices In California - https://medium.com/machinelearningadvantage/use-c-and-a-cntk-neural-network-to-predict-house-prices-in-california-f776220916ba
Building deep neural networks in C# has never been easier!
CNTK, or the Cognitive Toolkit, is Microsoft's tensor library and its answer to TensorFlow. It can build, train, and run many types of deep neural networks.
CNTK is fully compatible with .NET and C#. There's no need to use Python; you can easily tap into the power of deep neural networks from any .NET language, including C#.
CNTK is super easy to use. Watch this: I'm going to build an app that can predict house prices in California using a deep neural network.
The first thing we need is a data file with house prices. The 1990 California census has exactly what we need.
If you want to follow along, download the California 1990 housing census and save it as california_housing.csv in your project folder.

https://github.com/mdfarragher/DLR/blob/master/Regression/HousePricePrediction/california_housing.csv

« Last Edit: March 28, 2021, 11:41:59 PM by LiaoMi »

nidud

  • Member
  • *****
  • Posts: 2147
    • https://github.com/nidud/asmc
Re: Microsoft Cognitive Toolkit (CNTK) with ASMC
« Reply #1 on: March 28, 2021, 04:34:52 AM »
Quote
Is there any chance of adapting this CNTK product for use with assembly language programming?

Yes, that is possible, and some of these (or similar) constructs have been tested in the samples section.

Here's one of the inline DataType helper functions from CNTKLibrary.h.

Code: [Select]
inline const char* DataTypeName(DataType dataType)
{
    if (dataType == DataType::Float)
        return "Float";
    else if (dataType == DataType::Double)
        return "Double";
    else if (dataType == DataType::Float16)
        return "Float16";
    else if (dataType == DataType::Int8)
        return "Int8";
    else if (dataType == DataType::Int16)
        return "Int16";
    else
        LogicError("Unknown DataType.");
}

An Asmc version, using LIBC for output, would look something like this.
Code: [Select]
; build: asmc64 -pe DataTypeName.asm

include stdio.inc

DataTypeName proto dataType:abs {
    ifidn typeid(?, dataType),<?real4>
        lea rax,@CStr("Float")
    elseifidn typeid(?, dataType),<?real8>
        lea rax,@CStr("Double")
    elseifidn typeid(?, dataType),<?real2>
        lea rax,@CStr("Float16")
    elseifidn typeid(?, dataType),<?byte>
        lea rax,@CStr("Int8")
    elseifidn typeid(?, dataType),<?word>
        lea rax,@CStr("Int16")
    else
        printf("Unknown DataType.\n")
        xor eax,eax
    endif
    }

    .code

main proc

  local double:real8

    .if DataTypeName(double)

        printf("DataTypeName: %s\n", rax)
    .endif
    ret

main endp

    end main

And the code produced. Note that the macro resolves the type at assembly time, so only the "Double" branch survives in the generated code:
Code: [Select]
main    PROC
        push    rbp                                     ; 0000 _ 55
        mov     rbp, rsp                                ; 0001 _ 48: 8B. EC
        sub     rsp, 48                                 ; 0004 _ 48: 83. EC, 30
        lea     rax, [DS0000]                           ; 0008 _ 48: 8D. 05, 00000000(rel)
        test    rax, rax                                ; 000F _ 48: 85. C0
        jz      ?_001                                   ; 0012 _ 74, 0F
        mov     rdx, rax                                ; 0014 _ 48: 8B. D0
        lea     rcx, [DS0001]                           ; 0017 _ 48: 8D. 0D, 00000000(rel)
        call    printf                                  ; 001E _ E8, 00000000(rel)
?_001:  leave                                           ; 0023 _ C9
        ret                                             ; 0024 _ C3
main    ENDP

DS0000  label byte
        db 44H, 6FH, 75H, 62H, 6CH, 65H, 00H            ; 0000 _ Double.

DS0001  label byte
        db 44H, 61H, 74H, 61H, 54H, 79H, 70H, 65H       ; 0007 _ DataType
        db 4EH, 61H, 6DH, 65H, 3AH, 20H, 25H, 73H       ; 000F _ Name: %s
        db 0AH, 00H                                     ; 0017 _ ..

LiaoMi

  • Member
  • ****
  • Posts: 846
Re: Microsoft Cognitive Toolkit (CNTK) with ASMC
« Reply #2 on: March 28, 2021, 05:27:04 AM »
@nidud

In the previous post I attached the original header files needed for compilation (CNTKLibrary.h, CNTKLibraryInternals.h, Eval.h). They are not large, but they seem very complicated to me. Most importantly, there are a lot of classes. Do I remember correctly that you create all the classes by hand?

LiaoMi

  • Member
  • ****
  • Posts: 846
Re: Microsoft Cognitive Toolkit (CNTK) with ASMC
« Reply #3 on: March 29, 2021, 12:46:21 AM »
A basic XOR deep-learning example  :thup:

https://github.com/aidevnn/XorCNTKcpp
https://github.com/aidevnn/XorCNTKcpp/archive/refs/heads/master.zip

XorCNTKcpp
XOR dataset on a simple MLP with CNTK2.7 in C++ and GPU with CUDA10.0

Assuming you have the CNTK2.7-GPU distribution in C:\cntk-gpu-2.7, you can then build the project.

The output:

Code: [Select]
XOR dataset CNTK!!! Device : GPU[0] GeForce 930MX

Start Training File ...
Minibatch Epoch:     0    loss = 0.704330    acc = 0.50
Minibatch Epoch:    50    loss = 0.673318    acc = 0.50
Minibatch Epoch:   100    loss = 0.604436    acc = 0.50
Minibatch Epoch:   150    loss = 0.491982    acc = 0.50
Minibatch Epoch:   200    loss = 0.321270    acc = 1.00
Minibatch Epoch:   250    loss = 0.134061    acc = 1.00
Minibatch Epoch:   300    loss = 0.066452    acc = 1.00
Minibatch Epoch:   350    loss = 0.040356    acc = 1.00
Minibatch Epoch:   400    loss = 0.027866    acc = 1.00
Minibatch Epoch:   450    loss = 0.020855    acc = 1.00
Minibatch Epoch:   500    loss = 0.016466    acc = 1.00
Minibatch Epoch:   550    loss = 0.013498    acc = 1.00
Minibatch Epoch:   600    loss = 0.011376    acc = 1.00
Minibatch Epoch:   650    loss = 0.009792    acc = 1.00
Minibatch Epoch:   700    loss = 0.008569    acc = 1.00
Minibatch Epoch:   750    loss = 0.007601    acc = 1.00
Minibatch Epoch:   800    loss = 0.006816    acc = 1.00
Minibatch Epoch:   850    loss = 0.006168    acc = 1.00
Minibatch Epoch:   900    loss = 0.005626    acc = 1.00
Minibatch Epoch:   950    loss = 0.005166    acc = 1.00
Minibatch Epoch:  1000    loss = 0.004771    acc = 1.00
End Training File ...

Prediction
[0 0] = 0 ~ 0.001827
[0 1] = 1 ~ 0.994444
[1 0] = 1 ~ 0.995401
[1 1] = 0 ~ 0.007024


Code: [Select]
// XorCNTKcpp.cpp : This file contains the 'main' function. Program execution begins and ends there.
//

#include "pch.h"
#include <algorithm>
#include <iostream>
#include <string>
#include <random>
#include <vector>          // std::vector (used below, not pulled in by the headers above)
#include <unordered_map>   // std::unordered_map
#include <cassert>         // assert()
#include <ctime>           // time()
#include <cmath>           // round()
#include "CNTKLibrary.h"
#include "CNTKLibraryC.h"
#include <stdio.h>

using namespace CNTK;

inline FunctionPtr FullyConnectedLinearLayer(Variable input, size_t outputDim, const DeviceDescriptor& device, const std::wstring& outputName = L"", unsigned long seed = 1)
{
    assert(input.Shape().Rank() == 1);
    size_t inputDim = input.Shape()[0];

    auto timesParam = Parameter({ outputDim, inputDim }, DataType::Float, GlorotUniformInitializer(DefaultParamInitScale,
        SentinelValueForInferParamInitRank, SentinelValueForInferParamInitRank, seed), device, L"timesParam");
    auto timesFunction = Times(timesParam, input, L"times");

    auto plusParam = Parameter({ outputDim }, 0.0f, device, L"plusParam");
    return Plus(plusParam, timesFunction, outputName);
}

inline FunctionPtr CreateModel(Variable input, size_t hiddenLayers, size_t outputDim, const DeviceDescriptor& device, const std::wstring& outputName = L"", unsigned long seed = 1)
{
    auto dense1 = FullyConnectedLinearLayer(input, hiddenLayers, device, L"inputLayer", seed);
    auto tanhActivation = Tanh(dense1, L"hiddenLayer");
    auto dense2 = FullyConnectedLinearLayer(tanhActivation, outputDim, device, L"outputLayer", seed);
    auto model = Sigmoid(dense2, outputName);

    return model;
}

inline TrainerPtr CreateModelTrainer(FunctionPtr model, Variable input, Variable label)
{
    auto trainingLoss = BinaryCrossEntropy(Variable(model), label, L"lossFunction");
    auto prediction = ReduceMean(Equal(label, Round(Variable(model))), Axis::AllAxes()); // Keras accuracy metric

    auto learningRatePerSample = TrainingParameterSchedule<double>(0.1, 1);
    auto parameterLearner = SGDLearner(model->Parameters(), learningRatePerSample);
    auto trainer = CreateTrainer(model, trainingLoss, prediction, { parameterLearner });

    return trainer;
}

inline void PrintTrainingProgress(TrainerPtr trainer, int minibatchIdx, int outputFrequencyInMinibatches)
{
    if ((minibatchIdx % outputFrequencyInMinibatches) == 0 && trainer->PreviousMinibatchSampleCount() != 0)
    {
        float trainLossValue = (float)trainer->PreviousMinibatchLossAverage();
        float evaluationValue = (float)trainer->PreviousMinibatchEvaluationAverage() * trainer->PreviousMinibatchSampleCount();

        char buffer[70];
        sprintf_s(buffer, 70, "Minibatch Epoch: %5d    loss = %8.6f    acc = %4.2f", minibatchIdx, trainLossValue, evaluationValue);
        std::cout << std::string(buffer) << std::endl;
    }
}

inline void TrainFromMiniBatchFile(TrainerPtr trainer, Variable input, Variable label, const DeviceDescriptor& device, int epochs = 1000, int outputFrequencyInMinibatches = 50)
{
    int i = 0;
    int epochs0 = epochs;

    const size_t inputDim = 2;
    const size_t numOutputClasses = 1;
    auto featureStreamName = L"features";
    auto labelsStreamName = L"labels";

    auto minibatchSource = TextFormatMinibatchSource(L"XORdataset.txt", { {featureStreamName, inputDim}, {labelsStreamName, numOutputClasses} }, MinibatchSource::InfinitelyRepeat, true);
    auto featureStreamInfo = minibatchSource->StreamInfo(featureStreamName);
    auto labelStreamInfo = minibatchSource->StreamInfo(labelsStreamName);

    std::cout << std::endl << "Start Training File ..." << std::endl;

    while (epochs0 >= 0)
    {
        auto minibatchData = minibatchSource->GetNextMinibatch(4, device);

        trainer->TrainMinibatch({ { input, minibatchData[featureStreamInfo] }, { label, minibatchData[labelStreamInfo] } }, device);
        PrintTrainingProgress(trainer, i++, outputFrequencyInMinibatches);

        if (std::any_of(minibatchData.begin(), minibatchData.end(), [](const std::pair<StreamInformation, MinibatchData>& t) -> bool { return t.second.sweepEnd; }))
            epochs0--;
    }

    std::cout << "End Training File ..." << std::endl;
}

inline void TrainFromArray(TrainerPtr trainer, Variable input, Variable label, const DeviceDescriptor& device, int epochs = 1000, int outputFrequencyInMinibatches = 50)
{
    int i = 0;
    int epochs0 = epochs;

    std::vector<float> dataIn{ 0.f, 0.f, 0.f, 1.f, 1.f, 0.f, 1.f, 1.f };
    std::vector<float> dataOut{ 0.f, 1.f, 1.f, 0.f };

    std::unordered_map<Variable, MinibatchData> miniBatch;
    miniBatch[input] = MinibatchData(Value::CreateBatch(input.Shape(), dataIn, device, true), 4, 4, false);
    miniBatch[label] = MinibatchData(Value::CreateBatch(label.Shape(), dataOut, device, true), 4, 4, false);

    std::cout << std::endl << "Start Training Array ..." << std::endl;

    while (epochs0 >= 0)
    {
        trainer->TrainMinibatch(miniBatch, device);
        PrintTrainingProgress(trainer, i++, outputFrequencyInMinibatches);
        epochs0--;
    }

    std::cout << "End Training Array ..." << std::endl;
}

inline void TestPrediction(FunctionPtr model, const DeviceDescriptor& device)
{
    std::cout << std::endl << "Prediction" << std::endl;

    auto inputVar = model->Arguments()[0];
    std::unordered_map<Variable, ValuePtr> inputDataMap;
    std::vector<float> dataIn{ 0.f, 0.f, 0.f, 1.f, 1.f, 0.f, 1.f, 1.f };
    auto inputVal = Value::CreateBatch(inputVar.Shape(), dataIn, device);
    inputDataMap[inputVar] = inputVal;

    auto outputVar = model->Output();
    std::unordered_map<Variable, ValuePtr> outputDataMap;
    outputDataMap[outputVar] = nullptr;

    model->Evaluate(inputDataMap, outputDataMap, device);
    auto outputVal = outputDataMap[outputVar];

    std::vector<std::vector<float>> inputData;
    std::vector<std::vector<float>> outputData;
    inputVal->CopyVariableValueTo(inputVar, inputData);
    outputVal->CopyVariableValueTo(outputVar, outputData);

    for (int k = 0; k < 4; ++k)
    {
        auto in0 = inputData[k];
        auto out0 = outputData[k];
        char buffer[50];
        sprintf_s(buffer, 50, "[%d %d] = %d ~ %8.6f", (int)(in0[0]), (int)(in0[1]), (int)round(out0[0]), out0[0]);
        std::cout << std::string(buffer) << std::endl;
    }
}

int main()
{
    std::mt19937_64 rng(0);
    rng.seed(time(0));
    auto seed = (unsigned long)rng() % 10000;

    auto device = DeviceDescriptor::GPUDevice(0);
    //auto device = DeviceDescriptor::CPUDevice();
    std::wstring ws = device.AsString();
    std::wcout << "XOR dataset CNTK!!! Device : " << ws << std::endl;

    const size_t inputDim = 2;
    const size_t hiddenLayers = 8;
    const size_t numOutputClasses = 1;

    auto input = InputVariable({ inputDim }, DataType::Float, L"features");
    auto label = InputVariable({ numOutputClasses }, DataType::Float, L"labels");

    auto MLPmodel = CreateModel(input, hiddenLayers, numOutputClasses, device, L"MLPmodel", seed);
    auto MLPtrainer = CreateModelTrainer(MLPmodel, input, label);

    TrainFromMiniBatchFile(MLPtrainer, input, label, device);
    //TrainFromArray(MLPtrainer, input, label, device);

    TestPrediction(MLPmodel, device);
}

LiaoMi

  • Member
  • ****
  • Posts: 846
Re: Microsoft Cognitive Toolkit (CNTK) with ASMC
« Reply #4 on: March 29, 2021, 12:50:22 AM »
An assembler listing with comments. I'm not sure whether this can be implemented in assembler  :undecided:

nidud

  • Member
  • *****
  • Posts: 2147
    • https://github.com/nidud/asmc
Re: Microsoft Cognitive Toolkit (CNTK) with ASMC
« Reply #5 on: March 29, 2021, 01:07:25 AM »
Quote
they are not large

They depend on a rather large template library so you need to take that into account.

Quote
but very complicated as it seemed to me.

They are essentially a collection of per-type functionality, which implies a template for each type. The idea is that each type exposes the same functions, so you can use them without knowing the concrete type.

Code: [Select]
SafeRelease proto T:abs {
    .if T
        T.Release()
        mov T,NULL
    .endif
    }
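
For comparison, the C++ equivalent of that idea is a function template: one definition that works for any type exposing the same member function. A minimal sketch of the classic COM SafeRelease idiom (general C++, not CNTK-specific):

Code: [Select]
// One template serves every COM-style type that exposes Release():
// the concrete type never needs to be named inside the function.
template <typename T>
void SafeRelease(T*& p)
{
    if (p)
    {
        p->Release();   // any type with a Release() member works
        p = nullptr;
    }
}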

Quote
And most importantly, there are a lot of classes, I remember that you create all the classes by hand?

Asmc has .CLASS and .TEMPLATE directives, so it's possible to create a static (COM-style) class object.

C++

    Application.h
    Application.cpp

Asmc

    Application.inc
    Application.asm