The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

Monday, August 19, 2019

Numpy array multiplication


Dot product

  • Applies to one dimensional arrays, aka vectors.
  • The sum of the products of the components of the vectors.
  • The result supposedly represents “how similar the two vectors are”.
  • The first elements of each vector multiplied, then the second, third, etc.  Then add them all together.
  • Aka inner product
  • Aka scalar product.  Since the result is a scalar.
  • Notation is the two vector names next to each other, with a superscripted T on the first (for example, aᵀb).
  • In Numpy, the dot method of the first vector is called, passing the second vector as an argument. 
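As a minimal sketch of the description above (the vectors a and b and their values are made up for illustration):

```python
import numpy as np

# Two one-dimensional arrays (vectors) of the same length.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Call the dot method of the first vector, passing the second.
# Multiplies matching elements and sums them: 1*4 + 2*5 + 3*6
result = a.dot(b)
print(result)  # 32
```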

Hadamard product

  • Can do this on vectors of the same size.
  • Result is a vector also that same size.
  • The first elements of each vector multiplied and become the first element of the answer. Repeat for all elements.
  • Notation is a very small centered circle between the two vector names.
  • In Numpy, the * operator is used.
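A small sketch of the same idea, using made-up vectors a and b:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# The * operator multiplies element by element.
# The result has the same size as the inputs.
hadamard = a * b
print(hadamard)  # [ 4 10 18]
```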

Matrix multiplication

  • Applies to arrays with dimensions higher than one.  
    • Technically when there is only one dimension in either or both arrays, it is the same process described here, but each row and/or column have only one element, so it can be simpler to think of it only in terms of the "Dot Product" description above.
  • In Numpy, a matrix is a two-dimensional array (an array of arrays).
  • Can multiply two matrices A and B if the number of columns in matrix A is equal to the number of rows in matrix B.
  • Result is another matrix with the width of B and the height of A.
  • Each element of the result is a scalar.
  • Each element of the result is the dot product of one row of A with one column of B.
  • The first row of the result holds the dot products of the first row of A with each of the columns of B.
  • The second row of the result holds the dot products of the second row of A with each of the columns of B.
  • And so on.
  • Notation is the names of the two matrices next to each other.
  • In Numpy, the dot method of the first array is called, passing the second array as an argument.
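The rules above can be sketched with two small matrices. The names A, B, and C and their values are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])    # 2 rows, 3 columns

B = np.array([[7, 8],
              [9, 10],
              [11, 12]])     # 3 rows, 2 columns

# Columns of A (3) match rows of B (3), so the product is defined.
C = A.dot(B)

# C has the height of A (2) and the width of B (2).
print(C.shape)  # (2, 2)

# First row:  [1*7 + 2*9 + 3*11, 1*8 + 2*10 + 3*12] = [58, 64]
# Second row: [4*7 + 5*9 + 6*11, 4*8 + 5*10 + 6*12] = [139, 154]
print(C)
```

For two-dimensional arrays like these, the @ operator gives the same result as the dot method.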


Friday, August 16, 2019

Recommended process for process document distribution

Picture a "process" document that describes how a group of people will need to do some part of their ongoing work from that point forward.  At some point that group needs to become aware of it.  Such times include:
  • When it is first created.
  • When it is updated.
  • When a new hire is introduced to it.

Should you send them the document?

No.  It's not the end of the world if you do, but there is a better way.

Why not send the document?

Basically it wastes little slices of time multiplied by the number of recipients.
  • Everyone who receives it is now responsible for curating their own private copy every time a new revision is sent out.  This wasted time is multiplied every time another update is sent.
  • The document is just a snapshot of the actual document and will be out of date the moment any other modification is made.  There will always be doubt about whether any given copy is the most recent update, and possibly time wasted making sure.
  • New hires will need to obtain a copy.  Usually after confusion caused by not knowing that the document exists.  Usually wasting their time searching and the time of other people as they are forced to ask around about it.
  • Sending a copy of the document attached to a mass e-mail also wastes space in the e-mail system.

So then how should you provide the document to the group?

Recommended Process

Get ready

Establish an official location for the document if one does not already exist.  If an effective broader official location for such documents does not exist, that is a larger problem to be solved separately, but if that is the case, don't let it stop you.  Work around the problem for now.  Pioneer a location.  Be a voice for positive ongoing improvement.  If this step is done properly, it will only need to be done the very first time.

Get set

Store the document there.  This shouldn't take much time.  Presumably you've saved the document "somewhere" before you want to send it, right?

Go

When attention is needed, send people a LINK to the document along with any introduction or comments about what was just changed.

This process works even better if

  • The "official location" of the document is within a broader "official location" for all similar documents, and that location is in the form of something that the audience can subscribe to, so they receive notifications when something is updated.
  • New hires are given the location of the broader "official location", so they automatically have any new documents as they are added.

Monday, August 12, 2019

How to resolve linker error LNK2005

The C runtime libraries define the DllMain function used in all DLL components.  But MFC enabled DLL component projects also have an MFC specific version of this function that gets bound in by default.  So what happens when a project binds with both the DLL versions of the C runtime and the MFC libraries?  Since each of those has its own version of DllMain already defined, there can be a conflict.


LNK2005 _DllMain@12 already defined in MSVCRT.lib(dllmain.obj) mfcs100.lib(dllmodul.obj)

So how can this ever NOT be a problem?  Well, the C runtime DLL is smart enough to use trickery to protect itself from there already being a DllMain defined at compile time.  This is good, as otherwise none of our "user" DLL components that define their own version of this function could ever bind with the C runtime.

This linker error can still be a problem though, because the MFC libraries do NOT protect themselves from the possibility that this function already exists.  Their authors must have assumed that everyone only ever writes "pure" MFC components that never include any other libraries or define such functions themselves.

So resolving this is a matter of making sure that the MFC libraries are bound first.  How can we do this?


  1. Exclude the conflicting libraries from the default libraries, so the linker no longer picks them up by searching its search paths in the default manner.
  2. Add the libraries explicitly in a specific order.

In the case of the specific error above, this resolution ends up looking like this:



The two libraries are ignored from among the "default" libraries.  Then they are added as "additional" dependencies instead.  Note that the MFC library is placed first in the Additional Dependencies, followed by the C runtime.  This must be done separately for release mode and debug mode configurations as the libraries are named slightly differently for each mode.
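As a rough command-line sketch of the same settings: the fragment below assumes the release-mode library names taken from the error message above, and uses the linker's /NODEFAULTLIB option to ignore each library before listing both explicitly, MFC first.  The debug configuration would need its own library names, which are not shown here.

```text
/NODEFAULTLIB:mfcs100.lib /NODEFAULTLIB:MSVCRT.lib mfcs100.lib MSVCRT.lib
```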

Saturday, August 10, 2019

Deep Learning Professional Certificate Program (edX)


“Formal education makes a living, but self-education makes a fortune,” - Jim Rohn

Professional Certificate in Deep Learning

https://www.edx.org/professional-certificate/ibm-deep-learning

Due to the heavy use of Data Science when working on Deep Learning, I've decided to take an entirely different certification course first...
  • ☐ Deep Learning Fundamentals with Keras
    • Prerequisites:
      • Python Programming.
        • See prerequisites of next section for more Python related courses
        • https://www.edx.org/course/python-basics-for-data-science-2 ☑
      • Machine Learning with Python.
      • Partial Derivatives. edX recommended Khan Academy.  I also traded a sushi dinner to my son for some tutoring, where I posted my notes here and here.
        • https://www.youtube.com/watch?v=AXqhWeUEtQU Introduction ☑
        • https://www.youtube.com/watch?v=kdMep5GUOBw Formal definition ☑
        • https://www.youtube.com/watch?v=dfvnCHqzK54 And Graphs ☑

  • ☐ Deep Learning with Python and PyTorch
    • Prerequisites.  Here I elect to use Pluralsight courses I deemed similar to the edX courses.
      • Python & Jupyter notebooks
        • Python
        • Jupyter notebook
      • Machine Learning concepts
      • Deep Learning concepts

  • ☐ Deep Learning with Tensorflow
    • Prerequisites:
      • Same prerequisites as previous course.

  • ☐ Using GPUs to Scale and Speed-up Deep Learning
    • Prerequisites:
      • No prerequisites.

  • ☐ Applied Deep Learning Capstone Project
    • Prerequisites:
      • Completed all courses in the Deep Learning Professional Certification Program ☐


Partial Derivatives


Although this applies to higher dimensions too, this example sticks with a three-variable equation because it can be visualized in three-dimensional space.  It will look like a sheet that is tilted and/or bent in various ways.

Example equation



Graphed at geogebra



Partial derivative

This essentially means picking a point along one of the axes and getting the equation for the line that the sheet forms right there, with respect to the other two axes.

Process

Starting with the original equation graphed above 

Pick a point along the Y axis.  In the graph above, the “sheet” crosses the Y axis at position 0, so using that it gives us this equation for the line that forms where the sheet passes that axis.





This is the slope along the z and x axes; it can be helpful to express it as 2/1.  If the image is turned to view edgewise along the sheet with respect to the z and x axes, this slope becomes observable.
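The idea can also be checked numerically.  The sketch below does NOT use the example equation from this post.  It uses a hypothetical surface z = 2x + y², chosen only because fixing y at 0 leaves the line z = 2x, whose slope is 2/1.  The function names and the step size h are assumptions for illustration.

```python
def z(x, y):
    # Hypothetical surface, not the equation graphed in this post.
    return 2 * x + y ** 2

def partial_x(f, x, y, h=1e-6):
    # Hold y fixed and nudge only x (central difference approximation).
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

# Fix y at 0 and measure the slope along x at x = 1.
slope = partial_x(z, 1.0, 0.0)
print(round(slope, 6))  # 2.0
```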


Derivatives

A two variable equation can be graphed on a simple two dimensional grid.  It will often be some sort of curved line.  The derivative is an equation that gives the slope of that line at a given point.

Example Equation


Can also be expressed as

Derivative


How was it derived?

The following power rule is applied to each term:


The way I think of it is that the “t” in the power rule is the entire term, which is usually a constant, a variable, and an exponent.
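In standard notation, the power rule for a single term made of a constant, a variable, and an exponent can be written as follows.  The symbols a and x are choices made for this sketch; n is the exponent, as in the steps below.

```latex
\[
\frac{d}{dx}\left(a\,x^{n}\right) = n\,a\,x^{n-1}
\]
```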

First term

The first term of the example equation is


Which when thought of as the “t” in the power rule with constant, variable, and exponent looks like this:



Applying the power rule means taking the exponent as the value “n”.  In this case 4.  Multiplying it by the constant at the front of the term gives (4 x 1) or 4.  Changing the exponent to n-1 gives (4-1) or 3.  So the first term derives to:


Second Term

Looking at the second term of the example equation and what it derives into shows the 3 being multiplied by the 2 and the exponent decreasing to 2.

Third Term

The third term is similar and becomes 8x.  The exponent was reduced to 1 so it need not be displayed anymore.

Fourth Term

The fourth term sees the exponent reducing down to zero, and since anything to the zeroth power is one, the x to the zeroth power becomes 1 and doesn’t need to be displayed anymore.  It just multiplies by the 5 and becomes the value 5.

Fifth Term

The fifth term is just a constant, so like all constants it derives to zero and is effectively just dropped.  Why?  Because to get it into the “t” form for the power rule, it becomes:


With the starting exponent of zero, that means “n” is zero, and since the whole thing is going to be multiplied by “n” on the front, the whole term evaluates to zero.
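The term-by-term steps above can be sketched in a few lines of Python.  The coefficients 1, 2, 4, and 5 are inferred from the descriptions of the first four terms.  The plus signs between terms and the constant 7 are assumptions made only for this sketch, as is the function name differentiate.

```python
def differentiate(coefficients):
    """Apply the power rule to each term of a polynomial.

    coefficients are listed from the highest power down to the constant.
    """
    degree = len(coefficients) - 1
    # Multiply each coefficient by its exponent; the exponent then drops by one.
    # The constant term (exponent 0) is multiplied by 0, so it is dropped.
    return [c * (degree - i) for i, c in enumerate(coefficients[:-1])]

# Assumed form: x^4 + 2x^3 + 4x^2 + 5x + 7 (signs and the 7 are assumptions)
polynomial = [1, 2, 4, 5, 7]

derivative = differentiate(polynomial)
print(derivative)  # [4, 6, 8, 5]  meaning 4x^3 + 6x^2 + 8x + 5
```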

Wednesday, August 7, 2019

Intentional Error

Sometimes it is difficult to test error handling, because exceptions can occur in different layers of the code.  If an exception happens you want it to be logged, displayed, or otherwise findable.  If the exception handling doesn't properly make the exception details findable by the support analyst, they will have a tough time positively diagnosing the cause.

Yet to test the exception handling in every layer, one might need to sabotage various things.  Remove a file.  Lock down the permissions.  Temporarily remove an assembly.

Depending on the environment the test is conducted in, sabotaging elements on the server may be difficult or impractical, especially for a QA tester who may not have access to the box, or the training to "correctly sabotage" elements on it.

One approach is to pick one of the free form fields of a data record, and establish a series of testing values.  Something easy enough to remember, but not prone to be added to real records.  Like the key words "Intentional Error" followed by a six digit number.

Different sections of the code that process that record can check the field for the key words and a unique number related to that code section.  If the key words and number are detected, the code immediately throws a defined "IntentionalErrorException".  The testing scripts include the key words and numbers to use, where the exception is to be reported, and key elements that must be included for the error to be useful to support analysts.
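A minimal sketch of this approach in Python follows.  Only the key words "Intentional Error", the six digit number, and the exception name come from the description above.  The record layout, the "notes" field, the section number 000101, and the function names are hypothetical.

```python
import re

class IntentionalErrorException(Exception):
    """Thrown on purpose so testers can exercise the error handling."""

# Key words followed by a six digit number, e.g. "Intentional Error 000101".
_PATTERN = re.compile(r"Intentional Error (\d{6})")

def check_intentional_error(field_value, section_code):
    # Throw only when the number matches this code section's own number.
    match = _PATTERN.search(field_value or "")
    if match and match.group(1) == section_code:
        raise IntentionalErrorException(
            f"Intentional error {section_code} triggered for testing")

def save_record(record):
    # Hypothetical code section, assigned the number 000101.
    check_intentional_error(record.get("notes"), "000101")
    return "saved"

# A normal record passes through untouched.
print(save_record({"notes": "routine update"}))  # saved
```

A tester would then enter "Intentional Error 000101" in the free form field and confirm that the exception shows up wherever the testing script says it should be reported.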