Tutorials – S3lab http://s3lab.deusto.es S3lab Security Blog Wed, 06 May 2020 12:51:35 +0000 en-US hourly 1 https://wordpress.org/?v=5.1.5 HTTPS and Let’s Encrypt, protect your communications! http://s3lab.deusto.es/https-lets-encrypt-protect-communications/ Wed, 16 Dec 2015 10:57:49 +0000 http://s3lab.deusto.es/?p=7617 We must protect the system somehow. We have spoken before of how to protect web applications, configure a router or directly IoT systems. Today what we see is how to protect as much as possible communications with all these devices, using

The post HTTPS and Let’s Encrypt, protect your communications! appeared first on S3lab.

We must protect our systems somehow. We have previously discussed how to protect web applications, how to configure a router, and how to secure IoT systems directly. Today we look at how to protect communications with all these devices as much as possible, using SSL/TLS.

WEKA, from data to information (Part III) http://s3lab.deusto.es/weka-from-data-to-information-part-3/ Tue, 16 Jun 2015 09:57:24 +0000 http://s3lab.deusto.es/?p=3996 We return with a new article on WEKA. Following the introduction on how to prepare the environment and how use some basic functions, and the second part in which we tried to dig a little deeper into this tool, now it

The post WEKA, from data to information (Part III) appeared first on S3lab.

We return with a new article on WEKA. Following the introduction on how to prepare the environment and how to use some basic functions, and the second part, in which we dug a little deeper into the tool, it is now time to run some tests on classifying text. Some will ask what use it is to analyze text. We rely on this kind of technology every day: when we receive suggestions for products that might interest us, when that annoying spam fails to reach us, when documents are classified automatically, or simply when we run a search in our favorite search engine.

WEKA, from data to information (Part II) http://s3lab.deusto.es/weka-from-data-to-information-part-ii/ Tue, 10 Mar 2015 10:57:07 +0000 http://s3lab.deusto.es/?p=3447 To start, let’s download a new dataset with features wine production: wine.arff. To view the content, we can open it directly and select Weka as the default tool or within the Explorer section. If we look at the attributes of the

The post WEKA, from data to information (Part II) appeared first on S3lab.

To start, let’s download a new dataset with features of wine production: wine.arff. To view its contents, we can open the file directly, selecting Weka as the default tool, or load it from within the Explorer section. If we look at the attributes of the dataset, we can see “class”, the class to which each entry belongs.

Bloom Filter, an algorithm like an Hasbro game http://s3lab.deusto.es/bloom-filter-algorithm-hasbro-game/ Tue, 24 Feb 2015 11:00:54 +0000 http://s3lab.deusto.es/?p=3357 If you ever used Cassandra, you know that is well known at being extremely fast in writes and reads, and that is due in part to a data structure called Bloom Filter. Bloom Filter is an extremely efficient way of

The post Bloom Filter, an algorithm like an Hasbro game appeared first on S3lab.

If you have ever used Cassandra, you know that it is well known for being extremely fast at writes and reads, and that this is due in part to a data structure called the Bloom filter.

A Bloom filter is an extremely efficient way of asking whether an item exists in a set (Cassandra uses it to avoid useless disk accesses, which are the slowest part). What’s the downside? It is a probabilistic data structure and may return false positives (although never false negatives).

It consists of two elements:

  • An array of m bits, initialized to zero.
  • A set of k hash functions that, given some data, generate numbers between 0 and m-1.

When we insert an item, it goes through the hash functions, which return array positions whose values we set to 1. Then, whenever a query arrives, to know for sure that the item is not in the set we just have to check whether any of the positions generated by the hash functions holds a 0.
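The two elements and the insert/check steps can be sketched in a few lines of Python. This is only a minimal illustration, not how Cassandra implements it: the class name `BloomFilter`, the method names, and the choice of deriving the k hash functions by salting SHA-256 with an index are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: an array of m bits plus k hash functions."""

    def __init__(self, m, k):
        self.m = m              # number of bits in the array
        self.k = k              # number of hash functions
        self.bits = [0] * m     # all bits start at zero

    def _positions(self, item):
        # Assumption for this sketch: simulate k hash functions by salting
        # SHA-256 with the function index, then map the result to 0..m-1.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        # Inserting sets every position returned by the hash functions to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # If any position is 0, the item was definitely never inserted.
        # If all are 1, the item is *probably* present (false positives happen).
        return all(self.bits[pos] == 1 for pos in self._positions(item))
```

With this sketch, `might_contain` returning `False` is a definite answer, while `True` only means "possibly in the set".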

Let’s try a little game to understand this.

Imagine a game of “Guess Who?”, but with multiple suspects. Player A adds suspects, and Player B asks Player A questions based on the suspects’ attributes. In this analogy the suspects are the data, the attributes are the hash functions, and the full list of attributes that Player A keeps is the array of bits. These are our possible suspects:

[Image: Hasbro-S3lab-en, the possible suspects and their attributes]

Suppose that Player A adds Bill Gates as a suspect. Player A’s list now holds the attributes “Rich”, “Philanthropist” and “Glasses”. Therefore, when Player B asks about Steve Jobs, although “Glasses” and “Rich” are on the list, “American” isn’t, so we can be sure that Steve Jobs is not a suspect.

But what if we now add Mark Zuckerberg? In doing so, we add two new properties, so now our attributes list is as follows:

“Rich”, “Philanthropist”, “Glasses”, “American” and “Young”.

So now, what happens if we ask about Steve Jobs? Because all his attributes are on the list, we might think that he is one of the suspects, but that is not the case: this is a false positive. So we can be sure when someone is not a suspect, but never that someone is. If we add Steve Jobs, the list will not change. And what happens if we ask about Stephen Hawking? Even with three suspects already added, because Stephen Hawking has two attributes that have not yet been inserted, we can be sure that he is not one of the suspects.
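The game itself can be replayed in Python with plain sets. This is a rough sketch of the analogy only: the names `SUSPECTS` and `GuessWho` are made up for this example, the attributes of Bill Gates and Steve Jobs come from the text, and Mark Zuckerberg's full attribute set is an assumption (the text only says he contributes “American” and “Young”).

```python
# Attributes per suspect. Gates and Jobs are taken from the text above;
# Zuckerberg's third attribute ("Rich") is an assumption for this sketch.
SUSPECTS = {
    "Bill Gates": {"Rich", "Philanthropist", "Glasses"},
    "Steve Jobs": {"Glasses", "Rich", "American"},
    "Mark Zuckerberg": {"Rich", "American", "Young"},
}


class GuessWho:
    """Player A's side of the game: one shared list of attributes."""

    def __init__(self):
        self.attributes = set()  # plays the role of the bit array

    def add_suspect(self, name):
        # Adding a suspect only records the attributes, never the name.
        self.attributes |= SUSPECTS[name]

    def might_be_suspect(self, name):
        # A single missing attribute proves the person was never added.
        return SUSPECTS[name] <= self.attributes


game = GuessWho()
game.add_suspect("Bill Gates")
jobs_after_gates = game.might_be_suspect("Steve Jobs")        # "American" is missing

game.add_suspect("Mark Zuckerberg")
jobs_after_zuckerberg = game.might_be_suspect("Steve Jobs")   # all attributes present
```

Steve Jobs was never added, yet the second query answers "maybe", which is exactly the false positive described above.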

And that’s how it works (on the surface). If you want more in-depth information, for example on how to reduce the false positive rate or how to deal with deletions, here’s an interesting page:

Why Bloom filters work the way they do
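As a rough guide to the false positive rate, the commonly used approximation for a filter with m bits, k hash functions and n inserted items is (1 - e^(-kn/m))^k, and the k that minimizes it is about (m/n) ln 2. The sketch below just evaluates those two formulas; the function names are made up for this example, n is a symbol not used in the text above, and the approximation assumes the hash functions spread items uniformly.

```python
import math


def false_positive_rate(m, k, n):
    """Approximate false positive probability for m bits, k hashes, n items."""
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_k(m, n):
    """Number of hash functions that roughly minimizes the false positive rate."""
    return max(1, round(m / n * math.log(2)))


# Example: 1000 bits holding 100 items.
k = optimal_k(1000, 100)
rate = false_positive_rate(1000, k, 100)
```

In practice this means the false positive rate is lowered mainly by giving the filter more bits per stored item, then choosing k to match.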

And if you feel like playing with it a bit, this website offers several implementations in Python, which may be useful when you want to handle large datasets such as … data from the Titanic:

Fast Non-Standard Data Structures for Python

WEKA, from data to information (Part I) http://s3lab.deusto.es/weka-from-data-to-information-part-1/ Tue, 03 Feb 2015 11:02:05 +0000 http://s3lab.deusto.es/?p=3126 We start with the basics, installation and familiarization with the tool, and a simple example to see its possibilities. First, we download WEKA from the website of the University of Waikato in New Zealand. Within its download section, find the version

The post WEKA, from data to information (Part I) appeared first on S3lab.

We start with the basics: installation, familiarization with the tool, and a simple example to see its possibilities. First, we download WEKA from the website of the University of Waikato in New Zealand. In its download section, we look for the developer version, so that we have the latest release (3-7).

Starting as data scientist http://s3lab.deusto.es/starting-data-scientist/ Tue, 13 Jan 2015 11:38:12 +0000 http://s3lab.deusto.es/?p=3051 Terabytes of data stored everywhere, in various places, to everyone. Which is the reason of it, if we don’t use them for anything? Let’s start analysing.  You can predict the future with them and  understand  situations that happened long ago, it’s

The post Starting as data scientist appeared first on S3lab.

Terabytes of data are stored everywhere, in all sorts of places, for everyone. What is the point of that if we don’t use them for anything? Let’s start analysing. With data you can predict the future and understand situations that happened long ago; it’s a really exciting science.
